DEVELOPMENT OF ADVANCED ACREAGE ESTIMATION METHODS


INTRODUCTION 

A practical application of remote sensing of considerable interest is the
use of satellite-acquired (LANDSAT) multispectral scanner (MSS) data to
conduct an inventory of an economically important crop, such as wheat, over a
large geographical area. Any such inventory requires the development of accurate
and efficient algorithms for analyzing the structure of the data. The use of
multi-images (several registered passes over the same area during the growing
season) increases the dimension of the measurement space. As a result,
characterization of the data structure is a formidable task for an unaided
analyst.

Cluster analysis has been used extensively as a scientific tool to generate
hypotheses about the structure of data sets. Sometimes one can reduce a large
data set to a relatively small one by the appropriate grouping of elements
using cluster analysis. In some cases, the algorithm which effects the grouping
becomes the basis for actual classification. In other cases, the cluster
analysis produces groupings of the data which in turn serve as a starting point
for other algorithms which produce acreage estimates. Additional uses of
cluster analysis arise in conjunction with dimensionality reduction techniques
which are used to generate displays for purposes of further interactive analysis
of the data structure.

Work carried out under this contract dealt with algorithm development,
theoretical investigations, and empirical studies. The algorithm development
tasks centered around the refinement of the AMOEBA clustering/classification
algorithm, and its subsequent use as a starting point for HISSE, a maximum
likelihood proportion estimation procedure. Theoretical results were obtained
which form a basis for the maximum likelihood estimation procedures. In addition,
some investigations were made into the use of the Akaike Information Criterion
(AIC) when applied to mixture models. Additional work was concerned with the
development of a preliminary research plan which delineates some of the
technical issues and associated tasks in the area of rice scene radiation
characterization.

Specifically, investigations were carried out in the following areas:

    Refinements and Documentation of the AMOEBA Clustering Algorithm
    Rice Scene Radiation Characterization Applied Research
    Spectral-Spatial Classification Algorithm Development
    Use of the Akaike Information Criterion

Each of these investigations is discussed in turn in the sequel.

1. REFINEMENTS AND DOCUMENTATION OF THE AMOEBA CLUSTERING ALGORITHM

Detailed documentation of the AMOEBA clustering/classification algorithm 
for the version implemented on the HP-3000 System at EROS Data Center appears 
in an attached report entitled: 

Jack Bryant, System support documentation -- IDIMS FUNCTION -- AMOEBA,
Department of Mathematics, Texas A&M University, March, 1982.

Included throughout the documentation are comments which indicate where code 
changes could or should be made to transport the program to another system.

2. RICE SCENE RADIATION CHARACTERIZATION APPLIED RESEARCH


Work for this task was performed by Dr. James Heilman, Remote Sensing 
Center, Texas A&M University. The results of his investigations are presented 
in the attached report entitled: 

James Heilman, Rice Scene Radiation Research Plan, Remote Sensing 
Center, Texas A&M University, December, 1981.

3. SPECTRAL-SPATIAL CLASSIFICATION ALGORITHM DEVELOPMENT


The objective of this study was to formulate and test algorithms based 
on a likelihood function which respected the integrity of some predetermined 
structure in the data. 

For purposes of these investigations, the "pure field data" (patches) 
determined by the AMOEBA algorithm were used as the predetermined structure. 

A maximum likelihood parameter estimation procedure (HISSE) was designed to 
respect (take into account) field integrity. 

A mathematical description and implementation of the procedure, along 
with results from preliminary tests, appear in the report: 

Charles Peters and Frank Kampe, Numerical trials of HISSE, Contract 
NAS-9-14689, SR-H0-00477, Department of Mathematics, University of 
Houston, August, 1980. 

Theoretical results underlying the approach used in the HISSE algorithm
are discussed in the report:

Charles Peters, On the existence, uniqueness, and asymptotic normality
of a consistent solution of the likelihood equations for nonidentically
distributed observations -- applications to missing data problems,
Contract NAS-9-14689, SR-HO-00492, Department of Mathematics, University
of Houston, September, 1980.

Additional theoretical results were obtained which address the convergence
of a particular iterative form of the likelihood equations in the
case of a mixture of densities from (possibly distinct) exponential families.
These results appear in the report:

Richard A. Redner, An iterative procedure for obtaining maximum
likelihood estimates in a mixture model. Contract NAS-9-14689, 
SR-Tl-0481, Division of Mathematical Sciences, University of Tulsa, 
September, 1980. 

A modification of the HISSE model for the case of pure LANDSAT
agricultural data sets is discussed in the attached report:

Charles Peters, On possible modifications of the HISSE model for pure
agricultural data, Contract NAS-9-14689, SR-Hl-04037, Department
of Mathematics, University of Houston, February, 1981.

4. USE OF THE AKAIKE INFORMATION CRITERION

The objective of this study was to investigate the application of the
Akaike Information Criterion (AIC) to a mixture model. In particular,
investigations were carried out concerning the use of the AIC in selecting the
number of components of a mixture model. The results of these investigations
are discussed in the attached report:

Richard A. Redner, The Akaike information criterion and its application
to mixture proportion estimation, Contract NAS-9-14689, SR-Tl-04207,
Division of Mathematical Sciences, University of Tulsa, November, 1981.
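
As a rough illustration of the kind of model selection studied in this task
(a sketch, not taken from the cited report), the AIC of a fitted model is
-2 log L + 2k, where L is the maximized likelihood and k the number of free
parameters, and the candidate with the smallest AIC is preferred. The function
names below are hypothetical, and the parameter count assumes a Gaussian
mixture with unrestricted covariance matrices; the maximized log-likelihoods
are assumed to be supplied by some separate fitting procedure.

```python
def gaussian_mixture_param_count(m, d):
    """Free parameters of an m-component, d-dimensional Gaussian mixture
    with unrestricted covariance matrices (an assumed model form)."""
    proportions = m - 1                 # mixing proportions sum to one
    means = m * d
    covariances = m * d * (d + 1) // 2  # symmetric d x d matrices
    return proportions + means + covariances


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2 log L + 2k (smaller is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def select_components(max_log_likelihoods, d):
    """Return the component count with the smallest AIC, plus all scores.

    max_log_likelihoods maps a component count m to the maximized
    log-likelihood of the m-component fit (supplied by the caller).
    """
    scores = {m: aic(ll, gaussian_mixture_param_count(m, d))
              for m, ll in max_log_likelihoods.items()}
    return min(scores, key=scores.get), scores
```

For example, calling select_components with one log-likelihood per candidate
number of components returns the count whose penalized fit is best; adding a
component is accepted only if the gain in log-likelihood exceeds the number of
parameters it adds.
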
ABSTRACT

AMOEBA is a clustering program based on a spatial-spectral model for
image data. It is fast and automatic (in the sense that no parameters are
required), and classifies each picture element into classes which are
determined internally. As an IDIMS function, it imposes no limit on the
size of the image.

1. INTRODUCTION

AMOEBA is a clustering program designed during the Large Area Crop 
Inventory Experiment in 1977-78. The original idea* was developed in an 
agricultural setting (large fields, few real classes). It was a nice 
surprise to discover that the program solves other problems. It has an 
uncanny ability to discover structure in image data, at least when the 
structure exists. Because of the nature of the method, it operates 
efficiently on small (16 bit) computers lacking floating point hardware. 
In some sense, the smaller the computer the better it works. 

Bugs? Before going on, we report a problem experienced at the EROS
Data Center (EDC), a U.S. Geological Survey installation at Sioux Falls,
South Dakota. Unquestionably correct FORTRAN source code produces
nonsensical results. The bug is easily demonstrated, and may be related to
the file management system of IDIMS. It is not encountered unless huge
images with many bands are being processed. We do not know whether the
fault lies in FORTRAN, IDIMS, the operating system, or local hardware. We
do know it is not a problem in the source code, which is believed to
be without error. Scores of hours with the Hewlett-Packard IDIMS-DEBUG
utility only prove the program is lost. We welcome suggestions of any
kind whatever which may indicate what is wrong. Fortunately, the bug
shows up as a simple failure with meaningless cluster centers, or a
bounds violation where, according to the source, none is possible. That
is, it seems unlikely that the bug causes real damage, since the user
will be informed of garbage answers. A cynical systems programmer could
suggest that a disappointed user use ISOCLS instead. A diligent one
would find the bug, whatever it is. We are neither. We are only
exhausted, and wish you luck if you look for it.

*Jack Bryant, On the clustering of multidimensional pictorial data,
Pattern Recognition 11, pp. 115-125, 1979.

The Program. The idea underlying the program is easy to state. A
full description is given in Appendix A. Here we sketch the idea. Our
goal is to sort the pixels of an image into classes that will show an
analyst the structure of the image. Suppose one has two partitions of a
set into a family of disjoint subsets. A measure of the distance between
the partitions is the probability that points are clustered alike in the
first and differently in the second, plus the probability they are
classified alike in the second and differently in the first. Using a
boundary finding algorithm, we can extract samples from the data we
believe are alike. These will reside in spatially connected patches in
the complement of the boundary. By ordering samples on some one-dimensional
attribute, we are able to find some we believe are likely to be in different
real classes. The samples, called test pixels, come in test sets of five
each, and are used to evaluate our clustering. Test set means form
starting cluster centers.
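
The distance between two partitions described above can be sketched as
follows. This is an illustration only, not the AMOEBA code: it assumes that
"probability" means the fraction of unordered pairs of distinct points, and
that each partition is given as a list of class labels, one per point. The
function name is hypothetical.

```python
from itertools import combinations


def partition_distance(labels_a, labels_b):
    """Fraction of point pairs grouped together by one partition
    but separated by the other (assumes uniform weight on pairs)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same points")
    disagreements = 0
    pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a != same_b:      # alike in one, different in the other
            disagreements += 1
        pairs += 1
    return disagreements / pairs if pairs else 0.0
```

The measure depends only on which points are grouped together, so relabeling
the classes of either partition leaves the distance unchanged.
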

The number of clusterings of an image is astronomical. Rather than
evaluate all clusterings, AMOEBA successively eliminates cluster centers
which are involved in nearest neighbor assignments that split pairs from
the same test set, or gather pairs from widely separated test pixels.
Clusters are never combined, chained, or split. They are merely eliminated,
starting with the set of test set means, and ending with a set of clusters
between the user-supplied maximum and minimum numbers.

There are a number of general features about the program which have
nothing to do with the clustering and classification method, but which
do make the program harder to understand (harder, that is, than the big
system original version in which data was assumed to be all in memory at
once). Before we start detailed documentation, we comment on some of
the trickier details which apply to more than one component. Nomenclature
for the following:

COUNT -- a counter of the number of elements in each class.

ND -- the dimensionality or number of bands.

REJECT -- a vector of thresholds used to check classification
based on a spatial mixture model.

Wide Image Logic. There is no limit to the size of input image
which can be processed (other than disk storage). Yet there is a severe
limitation on data storage: subroutines START and CLASSIFY each require
three lines of data storage and three lines of labels. About 20,000 words
are available, so the maximum width is about 20,000/((ND+3)*3). If ND is
4, for example, fewer than 1000 samples per line can be processed.
Therefore, the program segments the image into strips of width NC, with the
actual width NZ being passed to various subroutines.
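
The strip arithmetic can be illustrated with a short sketch. It is not a
transcription of the FORTRAN: it assumes the width limit is
20,000/((ND+3)*3) words (the reading that agrees with the ND = 4 example in
the text), and the way the image is divided into nearly equal strips is an
assumption made for the example. The function names are hypothetical.

```python
def max_strip_width(nd, work_words=20000):
    """Largest number of samples per line that fits in the work area,
    assuming the limit 20,000 / ((ND + 3) * 3)."""
    return work_words // ((nd + 3) * 3)


def split_into_strips(image_width, nd, work_words=20000):
    """Return (first_sample, width) pairs covering the image width.

    Every strip is at most max_strip_width(nd) wide; strips are made
    nearly equal in width (an assumed policy, not taken from the listing).
    """
    if image_width <= 0:
        return []
    limit = max_strip_width(nd, work_words)
    if limit < 1:
        raise ValueError("work area too small for even one sample per line")
    n_strips = -(-image_width // limit)   # ceiling division
    target = -(-image_width // n_strips)  # target strip width (like NC)
    strips = []
    start = 0
    while start < image_width:
        width = min(target, image_width - start)  # actual width (like NZ)
        strips.append((start, width))
        start += width
    return strips
```

With ND = 4 the limit is 952 samples, so a 2000-sample line would be handled
in three strips of about 667 samples each.
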

The Mask. In IDIMS, imagery is organized in rectangular arrays;
however, the image itself is often not rectangular. The value 0 is usually
stored in each band or channel of the "Mask". In AMOEBA, a logical flag
MASK (optional parameter, default .TRUE.) is used to tell the program
whether a value 0 in channel 1 is to be used as a mask. If set, no
processing is wasted on these pixels. They are labelled with the label
99 and counted in COUNT(100). If some are found, their count is printed
at the conclusion of the program.
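
A minimal sketch of the masking rule, for one line of an image. The function
name and the list-based data layout are assumptions made for illustration;
only the rule itself (a 0 in channel 1 marks a masked pixel, which receives
label 99 and is counted) comes from the text.

```python
MASK_LABEL = 99  # label given to masked pixels, as stated in the text


def label_masked_pixels(channel1, labels, use_mask=True):
    """Label pixels whose channel-1 value is 0 and return how many there are.

    channel1 -- channel-1 values for one line of the image
    labels   -- label buffer for the same line, modified in place
    use_mask -- plays the role of the MASK flag (default true)
    """
    if not use_mask:
        return 0
    masked = 0
    for i, value in enumerate(channel1):
        if value == 0:
            labels[i] = MASK_LABEL
            masked += 1
    return masked
```
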

Four Neighbors. Each pixel inside an image with rectangular organization
has exactly four nearest neighbors. There are concepts for which
more than the four neighbors can be considered. Discrete connectedness,
however, is not one of them. AMOEBA uses discrete connectedness to form
patches in the complement of the boundary. More precisely, a path is
a sequence p_1, ..., p_n of pixels such that p_i is a neighbor of p_{i+1}
for i = 1, ..., n-1. A set of pixels is said to be connected if each pair of
points in the set is contained in a path lying entirely in the set.
For example, a singleton (a set containing only one pixel) is a connected
set, as is the entire image. In this discrete setting, the
concept is simple, but has considerable power. Let A be an arbitrary
set of pixels. For each a in A, the set of all elements of A which can
be joined to a by a path within A is a connected subset of A, and is,
in fact, the largest (maximal) connected subset of A containing a.

Maximal connected subsets of a set are called components of the set.
The patches of AMOEBA are the components of the complement of the boundary.
For this to work, only the four nearest neighbors can be considered when
deciding what a path is.
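
The definition of patches as components of the complement of the boundary
can be sketched with a standard breadth-first labeling. This is an
illustration of the definition, not the method used in START (which works
with three lines in memory at a time); the function name and the in-memory
boundary map are assumptions.

```python
from collections import deque


def label_patches(boundary):
    """Label the 4-connected components of the non-boundary pixels.

    boundary -- list of rows; a true value marks a boundary pixel
    Returns (labels, count); boundary pixels keep label 0 and patches
    are numbered 1, 2, ... in raster order of their first pixel.
    """
    rows = len(boundary)
    cols = len(boundary[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if boundary[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # only the four nearest neighbors define a path
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not boundary[ny][nx] and labels[ny][nx] == 0):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

In the test case below, the top-left pixel touches the rest of the
non-boundary pixels only diagonally, so it forms a patch of its own.
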

In the classification step, again only four neighbors are considered.
Here, however, we are really making a concession to the relatively low
resolution of Landsat MSS data, to poor registration of multi-temporal
imagery, and to computer time spent in classification. It would
certainly be possible to consider 8 neighbors; this has, in fact, been
done in areas dominated by agricultural activity, but for general usage
with Landsat resolution four neighbors are enough.

Circular Buffers. The boundary-finding and classification sections
of the program require not only a pixel but the neighborhood of the pixel.
With only the four nearest neighbors to consider, three lines suffice.
Rather than move the data or maps around, we simply switch pointers,
rolling old labels or maps out to disk and new data in. The logic is
simple, and is programmed as follows: Initially, pointers I1, I2, and
I3 are 1, 2, and 3. I1 is the eldest, I2 the current line to be processed,
and I3 the newest. After a line has been processed, the line pointed to
by I1 is stashed, data is read into the data slot pointed to by I1, and
the data buffer is "rotated" by saving I1, setting I1 = I2, I2 = I3, and
then I3 = the old I1.
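
The pointer rotation can be sketched as follows. This is an illustration
under simplifying assumptions: the image is a list of lines already in
memory, writing the eldest line to disk is omitted, and the pointers are
0-based rather than 1, 2, 3. The names are hypothetical.

```python
def process_lines(lines, handle):
    """Slide a three-line window over `lines` without moving any data.

    handle(eldest, current, newest) is called once per window position.
    Only the three pointers i1, i2, i3 change; each new line overwrites
    the slot that held the eldest line.
    """
    if len(lines) < 3:
        return
    slots = [lines[0], lines[1], lines[2]]
    i1, i2, i3 = 0, 1, 2              # eldest, current, newest
    next_line = 3
    while True:
        handle(slots[i1], slots[i2], slots[i3])
        if next_line >= len(lines):
            break
        slots[i1] = lines[next_line]  # reuse the eldest slot for new data
        next_line += 1
        i1, i2, i3 = i2, i3, i1       # rotate: the old i1 becomes the newest
```
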

Tricking FORTRAN. Short FORTRAN integers are 16 bits long; in the
HP-3000, this is the word size. Other than 16 bit processing consumes
time all out of proportion to the benefit. An obvious case is floating
point processing, but long (32 bit) integer processing is also expensive.
However, standard FORTRAN two's complement arithmetic is simply not
adequate. We "trick" FORTRAN by biasing all distance calculations by
-32768 (in octal, 100000, in hex 8000). That is, "zero" is actually
the bit pattern 1 followed by 15 zeros, and numbers grow from there
to the largest two's complement integer, 32767. The same trick is used
in forming labels for patches of pure pixels. We allow for 64K labels
by starting the labels at -32767 (-32768 is used to mark boundary). In
several places, the bias needs to be removed, and care must be taken to
ensure this is properly executed.
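
The biasing trick can be illustrated outside FORTRAN. The sketch below
assumes nothing about the AMOEBA source beyond the bias of -32768 stated
above; the function names are hypothetical. It shows that a quantity in the
range 0 to 65535 fits in a signed 16-bit word once biased, and that
comparisons between biased values give the same answer as comparisons
between the original values.

```python
BIAS = 32768  # octal 100000, hex 8000


def to_biased(value):
    """Store an unsigned quantity 0..65535 as a signed 16-bit integer."""
    if not 0 <= value <= 65535:
        raise ValueError("value does not fit in 16 bits")
    return value - BIAS


def from_biased(stored):
    """Remove the bias to recover the original unsigned quantity."""
    if not -32768 <= stored <= 32767:
        raise ValueError("not a signed 16-bit integer")
    return stored + BIAS
```

Because the bias is a constant shift, ordering is preserved, which is all a
nearest neighbor comparison needs. The bias must be removed before biased
values are added together or scaled.
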

Rejection Thresholds. Boundary pixels come in two flavors: pure
and contaminated. The pure ones are rare (usually not more than 20% of
the image), but are easy to model. Consider the spectral picture in
which a pixel represents the sensor-average of two pure classes. This
we call a pure boundary pixel. A case can be made for classification of
such a pixel in the nearer of the classes of which it is a mixture.
Obviously, such a pure mixture is nearer the one to which it is assigned
than half the distance between. This is the basis underlying the
Rejection Thresholds. For each cluster, the Rejection Threshold (calculated
in subroutine REJECTH and kept in REJECT(NO)) is half the distance between
this cluster and the other cluster the greatest distance away. (Actually,
the square of the distance, biased by -32768, is maintained.)
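
A sketch of the threshold rule as stated above, not a transcription of
REJECTH. It assumes Euclidean distance between cluster centers, works with
unbiased squared distances for readability, and uses hypothetical function
names. Since the threshold is half the distance d to the farthest other
cluster, the squared threshold is d*d/4.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two spectral vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def rejection_thresholds(centers):
    """Squared rejection threshold for each cluster center.

    For each cluster this is the square of half the distance to the
    other cluster the greatest distance away, i.e. d*d/4.
    """
    thresholds = []
    for i, center in enumerate(centers):
        farthest_sq = max(
            (squared_distance(center, other)
             for j, other in enumerate(centers) if j != i),
            default=0,
        )
        thresholds.append(farthest_sq / 4)
    return thresholds


def within_threshold(pixel, center, threshold_sq):
    """True if the pixel is no farther from the center than the threshold."""
    return squared_distance(pixel, center) <= threshold_sq
```
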

The model is strictly applicable in the absence of registration error,
but the model can be extended. The result, easily obtained from Jensen's
Inequality, is: for registration errors, the otherwise uncontaminated
distances should be no farther than [illegible] times the pure model. Pixels
which fail this test are not classified (or reclassified). However, test pixels
are believed to be pure, particularly the third in a test set of five.
Therefore, in MOREQUES, the stricter test is imposed. New classes are
introduced when the center test pixel fails the strict test.

Memory Management and Subroutine Linkage. Although the HP-3000
system allows dynamic array definition, we only use this in opening files
(these routines must be changed to move the program to another system anyway).
Memory is managed in the main program in an integer array called WORK (which
is equivalenced to a logical array LWORK). This gets memory managed but
makes subroutine linkage difficult to follow. As an aid to the Systems
Analyst who must maintain the program, we give the exact calling sequences as
they appear in MAIN, the subroutine version, and the various values of
memory management parameters.

THRFND (finds integer vector thresholds, returned in WORK(1))

CALL THRFND(NFL, WORK(MM1), WORK(MM3), UICB, IND, WORK(MM4), NR, NC,
            ND, MASK, IMGIN)

SUBROUTINE THRFND(NFL, INTTHR, SCANLINE, UICB, IND, KOUNT, NR, NC, ND,
                  MASK, IMGIN)

MM1 = 1                  WORK(MM1)    INTTHR(ND)
MM3 = MM1 + ND           WORK(MM3)    SCANLINE(NC,ND)
MM4 = MM3 + ND*NC        WORK(MM4)    KOUNT(ND)

START (using the thresholds, estimate the boundary and create a disk
file of boundary labels and patch labels)

CALL START(WORK(MM1), ND, NR, NC, NZ, WORK(MM3), WORK(MM4), WORK(MM5),
           UICB, IND, IMGIN, IMGCLAS, LABF, MASK)

SUBROUTINE START(INTTHR, ND, NR, NC, NZ, DATBUF, LABBUF, ISCAN, UICB,
                 IND, IMGIN, IMGCLAS, LAO, MASK)

MM1, MM3 as before       WORK(MM1)    INTTHR(ND)
MM4 = MM3+NC*ND*3        WORK(MM3)    DATBUF(NC,ND,3)
MM5 = MM4+NC*3           WORK(MM4)    LABBUF(NC,3)
                         WORK(MM5)    ISCAN(1)

Note: Parameters NC and NZ and their interaction are described in the
documentation to START.


ASELECT (select test sets and write on temporary disk file)

CALL ASELECT(WORK(MM2), WORK(MM3), WORK(MM4), NL, NS, NR, NC, NZ, ND,
             NTS, WORK(MM5), WORK(MM6), FILENO, UICB, IND, IMGIN, IMGCLAS)

SUBROUTINE ASELECT(DAT, LAB, KNT, NL, NS, NR, NC, NZ, ND, NTS, DATA,
                   LABEL, FILENO, UICB, IND, IMGIN, IMGCLAS)

MM2 = MM1+ND             WORK(MM2)    DAT(NL,ND,NS)
MM3 = MM2+NL*ND*NS       WORK(MM3)    LAB(NL)
MM4 = MM3+NL             WORK(MM4)    KNT(NL)
MM5 = MM4+NL             WORK(MM5)    DATA(NC,ND)
MM6 = MM5+NC*ND          WORK(MM6)    LABEL(NC)

Note: Parameters NL and NS are described in the documentation to ASELECT.

THINTSTM ("thin" test sets and form mean vectors)

CALL THINTSTM(WORK(MM2), WORK(MM3), LWORK(MM4), WORK(MM5), WORK(MM6),
              N25, N60, N288, N140, N388, N428, ND, NTS, FILENO, UICB, IND)

SUBROUTINE THINTSTM(MP, TSP, TIP, CLASS, COUNT, N25, N60, N288, N140,
                    N388, N428, ND, NTSI, FILENO, UICB, IND)

MM2 as before            WORK(MM2)    MP(ND,N140)
MM3 = MM2+ND*N428        WORK(MM3)    TSP(ND,5,N428)
MM4 = MM3+ND*N428*5      LWORK(MM4)   TIP(ND,5,N25)
MM5 = MM4+N25*ND*5       WORK(MM5)    CLASS(N25)
MM6 = MM5+N428           WORK(MM6)    COUNT(N140)

Note: N25, N60, N140, N428, N388, and N288 are described in the documentation
to THINTSTM.


SORT (sorts test pixel sets in average odd channel order)

CALL SORT(WORK(MM3), WORK(MM4), WORK(MM5), ND, N428T5, N428)

SUBROUTINE SORT(TSPXL, DUMMY, INDEX, ND, NP, NP5)

MM3 as before            WORK(MM3)    TSPXL(ND,NP)
MM4 = MM3+ND*N428T5      WORK(MM4)    DUMMY(NP5)
MM5 = MM4+NP5            WORK(MM5)    INDEX(NP5)
N428T5 = N428*5


NUMCLU (determines the number of clusters and their centers)

CALL NUMCLU(WORK(MM2), ND, N140, N428T5, WORK(MM3), NFCLUS, MINCLN,
            MAXCLN, WORK(MM4), WORK(MM5), WORK(MM6), WORK(MM7), WORK(MM8),
            WORK(MM9), UICB, IND, WORK(MM10))

SUBROUTINE NUMCLU(MEAN, ND, NP5, NP, TSPXL, NFCLUS, MINCLN, MAXCLN,
                  CLASS, COUNT, ERROR, SAVE, DUM, CSAVE, UICB, IND, NUM)

MM2, MM3 as before       WORK(MM2)    MEAN(ND,NP5)
(N140 and N428 were modified)
                         WORK(MM3)    TSPXL(ND,NP)
MM4 = MM3+ND*N428T5      WORK(MM4)    CLASS(NP)
MM5 = MM4+N428T5         WORK(MM5)    COUNT(NP5)
MM6 = MM5+N140           WORK(MM6)    ERROR(NP5)
MM7 = MM6+N140           WORK(MM7)    SAVE(NP5)
MM8 = MM7+N140           WORK(MM8)    DUM(NP5)
MM9 = MM8+N140           WORK(MM9)    CSAVE(NP)
MM10 = MM9+N428T5        WORK(MM10)   NUM(NP5)

MOREQUES (classifies center test pixel and adds more clusters if they are
needed; also initializes REJECT)

CALL MOREQUES(WORK(MM2), WORK(MM3), MAXCLUS, NFCLUS, ND, N428T5, WORK(MM5))

SUBROUTINE MOREQUES(MEANS, TESTS, MAXCLUS, NFCLUS, ND, NTS, REJECT)

MM2, MM3 as before       WORK(MM2)    MEANS(ND,MAXCLUS)
MM5 = 20900              WORK(MM3)    TESTS(ND,NTS)
                         WORK(MM5)    REJECT(MAXCLUS)

CLASSIFY (performs a spatially checked per pixel nearest neighbor
classification)

CALL CLASSIFY(WORK(MM3), WORK(MM2), WORK(MM4), NR, NC, NZ, ND, WORK(MM5),
              NFCLUS, UICB, IND, IMGIN, IMGCLAS, MAXCLUS, COUNT, MASK)

SUBROUTINE CLASSIFY(PIXELS, CLUSTERS, LABELS, NR, NC, NZ, ND, REJECT,
                    NFCLUS, UICB, IND, IMGIN, IMGCLAS, MAXCLUS, COUNT, MASK)

MM2 as before            WORK(MM3)    PIXELS(NC,ND,3)
MM3 = MM2+N288*ND        WORK(MM2)    CLUSTERS(ND,MAXCLUS)
MM4 = MM3+NC*ND*3        WORK(MM4)    LABELS(NC,3)
MM5 as before            WORK(MM5)    REJECT(MAXCLUS)
COUNT is INTEGER*4       COUNT(100)   COUNT(100)

These comments should make it easier to follow subroutine linkage and memory
management.
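
The way the MM offsets carve one work array into several logical arrays can
be sketched as follows, using the START layout (MM1 = 1, MM3 = MM1+ND,
MM4 = MM3+NC*ND*3, MM5 = MM4+NC*3). The function and dictionary are
hypothetical conveniences for illustration; the program itself simply passes
WORK(MMn) as the first element of each dummy array.

```python
def start_offsets(nd, nc):
    """1-based starting positions in WORK of the arrays used by START."""
    mm1 = 1
    mm3 = mm1 + nd            # INTTHR(ND) occupies ND words
    mm4 = mm3 + nc * nd * 3   # DATBUF(NC,ND,3): three lines of data
    mm5 = mm4 + nc * 3        # LABBUF(NC,3): three lines of labels
    return {"INTTHR": mm1, "DATBUF": mm3, "LABBUF": mm4, "ISCAN": mm5}
```

For ND = 4 and NC = 100, for instance, the data buffer starts at word 5, the
label buffer at word 1205, and the remaining scratch space at word 1505.
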

Organization of Detailed Documentation. There are only five subroutines
with logic complex enough to require a detailed description of
the algorithm. These are THRFND, START, ASELECT, THINTSTM, and NUMCLU, and
will receive more attention in the documentation which follows. As for the
main program, IDIMS parameter prompting and file management is easily
followed from the source. Because of the elaborate interface IDIMS puts
between the user and the outside world, the I/O portion of the program is
at least three times the length it would be in a normal FORTRAN environment.
But mere length does not make a program hard to understand, and the main
program of AMOEBA is truly self-documenting. The remainder of this document
consists of a listing of the main program followed by documentation
of each subroutine (in alphabetical order). Appendix A contains a
detailed description of the theoretical foundation of the program.
Appendix B contains a summary of each of the system subroutines used in
AMOEBA, with references to IDIMS and HP documentation. Appendix C contains
IDIMS User documentation, Appendix D shows the listing obtained in
an interactive sample use of the function, and Appendix E shows a batch
job to use the function.

Acknowledgement. We would like to express our gratitude to each
of the many scientists who took time to evaluate the results of AMOEBA 
clustering. Their suggestions and critical remarks led to several 
improvements. 



Program AMOEBA Listing 







• ccciwo ICONftOL ttanCNt ‘ANOCfcAKG 
0«v:02«i>0 ICONfAUE 1. 1 S T . LQt AT lOH . RAF 
«911T CIX'-MODO iUIKOUriNt AHO EC A< U 1C A . 3r AO > 

OOlSi C0004«0» 1.1'COtC*^ RCCIUt.CRA.INAREa. PIINCEO 

oom OOAAIOOO CHACACrCKtl «YnA0L(91> 

ioi;: o«'<4AO«9 iniccer*: uictc i >. 2EA0< i >. ihp< < > moaac ; t»oo >.outtyfc. f itCMO 

»931I 99«9r»«9 INl(^EK>4 91 AC . C OON T( I « 0 > 

»0i13 »»C»IOO« LU.ICAl < f a rf I Lt .LV0*A> 2 1000 ) .la. lb. mask. chamihap.laclhap. clairap 

00111 00001000 CMAKACIER*! (B CNRAP . CAL R AP . ( ACL RAP . E0RASK 

00)13 00010000 CNARACriR»32 INAHC 

001*T 0« >1 1000 CNACACICP«iO lANAR 

Lvss: 00O12000 :macactcc*)2 inihc 

« ‘S?: .-HO).' urcccA>2 narek « >. abpm< a >. codck e > . furuh. chrap .irap .clrrp 

Ovm vOOIOOOO • .BlFSIZKl >.IPCBARUIB>.IRASA.PCINrtL.P«lRTHL.PA(NTBt 

00)11 OOoHOOO CAiiVALCNCt < IVO AK( I > . MORM I > ), < EANAR. NARZI > . < LA . I A > . ( LA . I ■ ) 

oom OOOICOOO CAiMVALCHCf ( 1RARC A . INARC > . ( FRING . TH IR6CA > . ( EACRR AP . CMRAP > 

'-viM OOOIFOOO • . LCOLnAP.LRAP > . ( FACINAP .CLRAP) . ICORAIR. IRA1P > 

ovii; oootaoao ivoicH iniainsic fopen.flneck 

veil! 00011000 SNARE • ’ 

Ovl.'O 00020000 NFAIHS • A 

OOJ.'L 0002IOOO CBNAR • 

OCl.'l 00..22000 •'SrATFlLCPCNfFlOtCHANINAPLABClRAPClASSRAPRACK NiHCLUl RAMCLN* 

V04«; 0'.' >21000 COOEk(l> • LiOOl 

0v4>:v 0OO24000 COOES(2> • XI 

0<.41i 00021000 C00CS(3> • XIO) 

0041% 0OO2C000 C0BES<4> • XIO) 

. C •* I vO.'2/OOO CQ&ESCIV • X103 

OOuAOOOO C00CC<4> - XIO) 

;>;ur oooiiooo cotCKF: > "i 

CL«.'4 OOCIOOOO COOEt(t> • 

•>04/1 00011000 aOOPSII) • lAOORCSK SNAREO > 

cc*.; vconcao > lacc-sE^siHFi > 

Oder 000)1040 AOCAS()> » IAOBRCSSCCHRAP > 

4011 4 00014000 A0CAt(4> « I ACDAES t ( L R AP ) 

0>>12l 000)1000 APOASCD ■ I AAORESK C LRAP > 

0w124 OOOIiOOO ADOPSIO • I AOOAEt S C I RAI X > 

003!! Ov.'iroOO AOCAAin • lABOIESKRIliCLN > 

00140 OOOIAOOO AODAKA) > I AOORESSC R ARC LN ) 

04341 04 .11000 AUtUEvl) • 112 

OOSIV 04040000 CALL A IN VOS ( U I Ct . I NO . I . f LK 1 1 2E . IP C AAR. I 2A > 

001ft 00.)4lv0 IF ilNO.l' LF 0) CALL CNM 0( U 1C A . I NO . I . AL RS I ZE< I > . I PC I AR< I > . 0 . J F > 

OOCI2 004420)0 C •• IHIIIAL12C BEFORE CALL 10 PARARS 

OOCIi 004414.-: HPL • 41 

0C4I4 40444000 tlATPILE ■ TRUE 

OC*U 04.41004 EOTHRAF • 'R' 

OvAir 00>44000 EOlRAP • 'N' 

04A!t 4v)4r004 ESCLRAP • 'N' 

00C41 04441000 EANACK • ’f 

0»014 O4.410C0 RINCLH • 10 

tvtii 04.10000 CNtNIRHf • FALSE 

Ovt(« 04<SI004 LAILRAP • FALSE 

Ovtta OU0S2O00 CL4SHAF • FALSE 

04444 00411000 RAIL • IRUE 

0L 444 00 >14000 RAFCLH • II 

(•ItM. 4>'.11000 CALL P AR AHS c U I C A . N ARE S . C 00 E i . ABBR t . NP APRS > 

Oor; 2 0... 14404 IF . RA tlN Ll -1< OR NaYilN Cl 1A > RAXCLN • 1A 

OOFN 04 irooo N|»S * UICA(4t> 



•oris 0003(000 N003 > UIC((t2> 

oorso 0003(000 IP CHIOS H(.l> C(LL pmoktiuici.i.o 

00ri3 000(0000 IP (N00« N(.1> coil P0(0RT(IMC(.I .0) 

00 r 0 ( 000(1000 C chick if STOTPILC HHHC It HOT tUPPLieO 

oor(( 000(1000 IP (CoocKi non 1 Eo.o) co to looo 

oorso 000(1000 00 3 I ■ o.t.*i 

ocm 000(0000 IP (tNontt 1 >1 1 Hc. ’ * > co ro ( 

•tool 000(3000 3 CORTIHUe 

•100* 000((000 COLL PA(0RT(UIC(.-(3.0> 

OlOlt 000(3000 ( (HAHC[l*|iri ■ ' (TOTS ' 

01032 000(3000 10 • I 

0107* 000(3000 II ■ I 

0103C oooroooo recsize ■ too * 2*HD 

01042 oooriooo C OPEH fTOTPILE OS OLD TO CHECK FOR OUfLTCOTE HRHE 
01042 ooorzooo FlLNUn • FOPENOSHANE. LO.LO. RECSIZE > 

01033 oooriooo COLL FCHECROO. ERR. .• > 

010(0 000F4O04 IP (CRP.EO.O) COLL PRIOR T( UI CO > 43 , 0 > 

01033 00033000 10 • 0 

01033 0003(000 FILNUn • FOPCNrSHAHC.LA.LR.RECSIZO 

OIIO( 00033000 CALL FCHECK< 0 . ERR. . . > 

01113 00433000 IF (ERR EO.O) CO TO 1110 

01I3C 0003(000 COLL P AlORT ( U I Cl . 43 . 0 . 

01130 OOOIOOOO I09( STATFILE > FOLSE. 

OlITt OOOSIOOO 1110 CQHTIHUE 

011*2 00002000 C>- ASSIOH LOCtCALS 

01133 004(3000 IF ( EOCNnOP El . ' T' > CNAHIHOP • TRUE. 

0114* 00404000 IF CEOLOAP (l.'T’) LAILROP ■ TRUE 

01I3( 00'330C4 IF CEOCLPAP CO . 'T' ) CLOSHAP ■ TRUE 

0II3C 000((000 IF '.CanASK EO.'H ) NOSR • .FOLSE 

OI2C2 000(3000 C -- CO HOPPINC PARAHS 

0I2C2 0000(000 IHCIN « 1 

0I2C4 00003000 IICCLAS • -1 

0120( 000(0000 C 

012:s 000(1000 C OPEN IHPUT to CET no t NC. THEN CLOSE 

0I20( 000(2000 CALL OPEHP I ( U I C( . I HD. IRC IH . I H TYPE . HO . HR . KC , 1 } 

0122* 000(3000 IF < IHCC 1 > . L T . 0 > CALL CHKI 0< V Id > I HD . IHCI H. HD. HR . HC . 1 OO > 

0124* 000(4000 IP < HO . CT l( OR HD LT 2 > CALL PORORT ( U I C( . 4( . 0 > 

012(1 000(3000 OUTTYPE > 2 

012(* OCO((000 CALL CLOSEPIUICI. IHD. IRGIH.O > 

01234 000(3000 IP ( IHD( 1 >. LT 0) CALL CHK 1 0( U ICE . I HO . I HCt H . I . 2 . 3 . (( > 

01313 OO'dOOO IF < HOT ICHAHIHRP Of LOOLHAP OR CLASRAF)) GO TO 8 

0132S 0003(000 HPARHS • 3 

013*0 09100000 PRIHTSL • 1 

013*2 OOlOtOOO FRIHTHL • HR 

013*4 OOlOZOOO PRIHTSS • I 

01334 00103000 ECHAH • 'PRIHTSL PRIHTHL PRIHTSS * 

013(2 00104000 CODES! I) • 21 

01T4? OOlOSOOO CODES'** • 21 

01*30 OOlOSOOO CODE t> • 21 

013*' O' .03000 ODOKStt) • lACCRESS(PRIHTSL) 

01*7'. OOlOSOOO ACC'RS(2> • I AC DRESS! PH I N THL > 

0I40S 0010(000 ADDRS!3> • 1 AD CRESS ( P R 1 H TS S ) 

0I4IZ 00110000 CALL P ARAHS ( U 1 C( , RARE S . C ODES ■ AC DR S . HPARHS '' 

01424 00111000 8 CORTIHUE 

01424 00112000 C 

01424 00113000 C TA*E CARE OF RICE IHAGES: 

01424 00114000 C 


0142* OOU9444 C NZ It ACTUAL NI»TH 

41424 44ll«404 C AC It TARCCT HUAtCR CRAttCS AT A TIAC 

41424 40I1Z444 C tACH TUIROUTIAC AUtT AAAACf ACTUAL PAAAAfTCAt FOR 

41424 441IR444 C RCAOR. HRITC4. AND INC IR lUARONTIHCt. 

41424 44119444 AZ • AC 

4l42i 44124444 ACT > 24444/1 1*( AD« I > > 

41414 40121444 44 1441 I • 1.99 

41441 40122444 AC • N2/I4I 

41444 40121444 IF INC LC AC T > tO TO 1442 

41494 44124440 1441 .QATINUC 

01491 40119000 1402 Cv.'tTINUC 

4I49* 40124444 C 

0I*'!3 00121444 C INITIAL CtTIAATCt FOR OUFMRt 

41499 40121444 C AIR -• NUAOCR OF tUFFERI OA READ 

4149* 00129444 C AOU -- AUAOER OF OOFFERt OA ORITE 

01499 00M444O C CET At AAAY READ A* rOttlOLEI 

01499 00111440 ARR > niA4(N4*2.2) 

41414 44112444 AIR ■ 2 

01444 00131040 C 

01414 40114444 C RRCR INACEt FOR REAL AOU 

41414 40119440 C ORCA OUTPUT lAAtE 

41444 00114040 call OPCNPOI U I CA . I HR. I ACCL At . Zt RO . OU TT YPE . I . NR . RZ . NOO > 

01309 OOllFOOO IF < IAD< I > LT 4) CALL CHKI 0< U ICO. IND . I AGCLAt . AR. NZ. AOU . 1 41 ) 

41929 40110440 C OPEN INPUT -• TRY UNTIL NOR IS 2 

01929 00119000 994 CALL OPCNP I ( U I CO . I AD. lAt IN . INTTPE . NO . NR . AZ . ARR ) 

41940 00144440 IF <INDU>.CE 4> CO TO 99C 

01949 00141440 C OUT OF VIRTUAL ACAORTT 

0194* 00142040 IF ( IA4( 2 > . HE 91 ) CALL CAR I0( UI CO . I AD , I AC I A . 

01949 00141040 * NOR. AO V . 4 . 1 0 2 ) 

419(t 40144404 IF (NOR LE.2) GO TO 999 

OI!i: 00149040 NOR • NOR • I 

01911 00144000 GO TO 994 

Oi;.M 00141000 C ARP It LEtt TAAN OR COUAL TQ 2. SO 

0191* 00149040 949 IF INRU ED I AND NORCO l> CALL P AOORK UI CO . I 1 . 4 > 

01412 00149404 IF (AOU CD 2> CO TO 993 

41414 04194004 NOR • I 

01420 00191400 991 AOU ■ I 

0I4.'2 00192040 C REDO OUTPUT VITA FEVER OUFFERS 4 TRY AGAIN 

01422 0OI91O00 call CLO$CP(UI(R. IAD. IACCLAI. I ) 

41t?1 04194040 IF CIND(I> LT 0> CALL CNR 1 0( V ICO . I AD . I NCCL At . 4 . 4 . 4 . 09 > 

914*9 00199040 CALL OPENPOI U I CO . I AD . I ACCL At . ZERO . OUT T Y PC . I . NR . AZ . NOV > 

91414 00194440 IF (IND(1> LT 0> CALL CNR 1 0( U I CO . I AD . I NGCLAS . NZ . AOU . At R . 1 41 > 

01714 00191444 CO TO 994 

41119 00190444 C CAN'T DO IT - INPUT INAGC TOO RIG 

0111? 00199440 991 C ALL P ARORT < U I CO . 1 1 . 4 > 

41729 00144090 C -- ALL OPENS CONPLETE AND SUCCESSFUL 

41127 ('0141040 994 CONTINUE 

41729 00142400 C 

4I.'2? 00143000 C tell NE HOH NAKY OUFFERS I COT 

41129 00144444 URIlt (TNIHC.14I4> NOR. NON 

411*0 00149440 1010 FOPnATtVA YOU AAVE.I1.I7H READ OUFFERIS.' AND, 12. 

01190 00144040 • llA URITC OUFFER(t) > 

411*( 00141040 CALL P F IN TP( U I C 0 . I HD , 1 . TN I ACE 4. 90 . O . 0 , 0 . 0 . 0 , 0 . 0, 0 > 

417,-1 OOI4U040 C lOFM 'ni) It INTTMR -■ USED tV START 

41111 00149400 C 

41111 00114000 C DETCR'INC If THE DATA CONTAINS ANY VALUE CE 1 2 1 

01177 00111000 C FIFtT FEAC THE DATA 

HRf • lfNR/99 
HC1 - 14NZ7I99 
to 12 IR - 1. NR. HRS 
to 32 K • I.NO 

C4LL RCOtPlUICt. IND. tH0lN,V0RK,2. I . I R . 1 . NZ . I R. K* 1 . 1 , P2 > 

17 ( tNO( 1 >.LT .0) COLL CNK 1 0( 0 1 Ct . I NO . I RC I N . 1 R. NZ . It . 92 > 

00 12 J ■ l.NZ.NCS 
IF (MORK(J) CE.IZO) 00 70 3] 

12 COHriHUe 
60 TO 14 

11 MRITCC TNtNO.ll > 

19 FORntTdOH YOUR INOCE CONTOIHS 0 VALUE OVER 127 > 

CALL FflHTFlUtCt. IND. 1 . TNIHOEO. 38 . 0. 0. 0 . 0 ,0 . 0. 0. 0 > 

HRITEl rHIHG.lEl 

34 F0RKAT(44H FLEASC USE HAF TO PUT INTO THE RANGE 0-127 > 

CALL PFtNTPlUtCt. IHO. I . THI NCEO. 44 . 0. « . 0 .0 . 0 . 0. 0. 0 > 

CALL PA90RT(UtC0.40.0} 

14 CONTINUE 

X • <20740. -FL0AT(HC*(ND4I >>/FLOAT(NO>> 

NL ■ lF|V(SaRT<R>> 

NS • mx<<2l 000 . •FLOAT! 2*NL >‘FL0AT< NC*(NC4 I M >/FLOAT(HO*NL >> 

C 

C TNPFNC P0E!H'T need AS RUCN AOON . . 

NCS > NC 
NC ■ HC*3 

IF (NC .CT .N2> NC ■ NZ 

RNl ■ I 

NNl ■ RNI4N0 

NN4 > nN34N0*NC 

CALL TNRFNOCHFL. VOAK( NNI >.HORK< NNl >, UICO. IND .U0RK<NN4 >, 

* NF.NC.Nt.RASR.INGIN) 

00 17 1 ■ I.NC 

IF (VOFK(I) LE 0> H0RK(t> • I 
17 CONTINUE 

VRITE< THING. I 1 It ) (WORRCKI.K • t.NO) 

111! FOtNAT'’ IHTTNR « ’,I4tl> 

call PFlNTFdilCe. INP. I, THIN6E0. 1043*NP,0,0,0.0,O.C,0.0> 

C 

C RESTORE NC 
NC > NCS 

NN4 - nni*NC*NP*l 
CALL SCrSYHCSYRtOL > 

AN! . XN44NC*3 

CALL STARK UORK( NNI NO . NR . NC . NZ . UORX( NNI ). tORRC NH4 >. VORK( NRS >. 

* UtCt. INt.IHGlN. inCCLAS.LAOF.RASK > 

OLAt * LNOF 

PLAI • PLA0«3274O 
VRtTE< thing. 22221 DLA8 
2222 FOFNATC RLAOELS • '.191 

GALL PRINTFCUICR. INO, I . TNIN6CC. 18.0, 0,0. 0,0. 0,0, 01 

IF! CHANIHAF >C all naff ( FKINTSL , FRINTNL.FAINTSS .Dice . IND. INCIN . 

* NR.NZ.STRBOLl 

2013 IF< LAOLNAFICALLHAFF! f SINTSl, FRINTHL PR IHTSy Uice iND, INCCLAS. 
t NF ,n:, SYROOL > 
am • aNiiND 
Bar • an2«NL4ND4MS 
F1*J • rnuHL 

«3««4 

00230000 


AN3 ■ RH44NL 

• 2 <«r 

00231000 


RR« ■ AH3«HC*A0 

03*53 

00232000 


CALL AtCLEC T( HO*K< AR2 >. V03K( HH3 >. «ORE( HA« >. NL. NS . H3 . 

*2*53 

00233000 


* NC.N2.H0.HrS.NBBr(nn31,U0AK(NN*).riLE«0.UICt. IHO. IRCIH, IRCtLAS > 

0270* 

00234000 

C 

NT* 15 THE NUABEA OF TEST SETS STASHED BY SELECT 

0270* 

00233000 


VBt TC( THING. 3333 ) NTS 

02725 

00230000 

3313 rOOnATC 01*7*73 • '.IS> 

02723 

00237000 


CALL PRIN7A(UICB.IHD. 1. THINCEB, l(.0>0.0.9.9.0.«.0> 

0273* 

00233000 


H25 ■ 33 

0273* 

00230000 


N*9 - *0 

027(0 

00240000 


NI40 • too 

027(2 

00241000 


N42S ■ 300*/(HD*3> 

02 7*7 

09242000 


N33S • N42S-49 

02772 

002*3000 


N233 ■ H3*S-tO« 

*2775 

992**000 


tr (N2SS.Lr.lO*> F23* ■ 1*0 

030C2 

00243000 


nni - nN24HD*N423 

0300( 

092**000 


NH4 - HR34HD*H423 S 

0301 3 

00247000 


HRS > nN«*N2S*HD*3 

0302C 

992**000 


HR* ■ RR34R42* 

*5023 

99245000 

c 

H0*K(RH2> 13 REAN(N3.NI40> 

01021 

99250009 

c 

50FKsnn3> IS T8rPXL(RD.R423*3; 

0302 ? 

902310O9 

c 

HOVEVEA. ALLOH FOR H42B IH CASE BE NAVE FEH 

0 302 3 

09232000 


CALL TMlNTlTAt HORKC HR2 1. HORKC RR3>. LH3BK(HR4 ) ,HOBK(HAS >. 

03*23 

09233000 


» HQ3R(RR*9.N23.NS«, N2SB.N144.RSSS.N42B.HB.NTS. 

0 302 3 

90234000 


• FILENO.UtCB. INO) 

03033 

00233000 


H«2BrS ■ H423*3 

0303( 

0923*000 


Rfl* ■ AH34R42Br3«ND 

03070 

00237000 


P.H3 - RN44H42S 

C307I 

09253000 


CALL SO*T(HO*K(Rn3>.VORK(flRO).HORE(AR3>.ND.R42BT3.N42B > 

031C5 

00235000 


RR3 • RN44N42SrS 

OHIO 

992*0000 


«R* - RN34R140 

03113 

092(1000 


RR7 ■ AN*4R|40 

03ii( 

002(2000 


HRS ■ AR74N140 

03I2I 

002*3000 


RH5 • RR34RI40 

0312* 

993(4000 


RR19 • HR54H423TS 

03127 

092(3000 


CALL NUHCLU(V0RK(nR2>.ND.N14O.N42BTS.HORS(AR3>.NFCLUS. 

03127 

002*(000 


4 niHCLN, UORK( RR4 >, HORKC HR3 >. HORKC RH* >. HORKC RH7 >. 

03127 

093*70*0 


4 HORKC RRS >. HORKC RRSl.UiCB. IRO . HORKC RRl 9 >. RRXCLN > 

03l(« 

002*3009 


RAXCLUS > 50 

*3I(( 

092(0000 


RHS ■ 30500 

031 75 

0037*000 


CALL RGREPUESCHOKKC RR3), HORKC HR 3), RAXCLUS. RFCLUS. HO. 

*3170 

00371000 


4 N438T5.H0RKCRH3>> 

03203 

09272000 


CALL NSORTC HORKC HR3 >, NO, NF CL 0$. HORKC RRO. HORKC RR 7 >> 

0321* 

00273009 

c 


0321 » 

00274000 

c 

CLOSE TERF IHACE AND OPEN CLUSTER HRP 

0321 « 

00273000 


CALL CL0SEPCU1C8.IND. IRGCLAS. 1 > 

03225 

0027*900 


IF CIHOCI' LT 0> CALL CHK 1 OC U ICB. t NO . 1 RGCLAS . 0 ■ 0 . 0 . 77 > 

032*7 

99277000 


CALL OCLUOSCUICB. IND> 

0325* 

9027*000 


IF C INOC 1 >. LT 0) CALL C NK 1 OC U 1 CB . I NO , 1 RGCLAS . 0 . 0 , 0 . 73 > 

0327* 

04279000 


IRGCLAS > 2 

03 3CC 

O023090O 


CALL OFENPOCUICB, INO. 1 RGCL AS . ZERO . 1 . 1 . NR . RZ . NBU > 

0332C 

093*1000 


IF CIN0C1> LT »> CALL CHK 1 OC U 1 CB . I NO . 1 RCC L AS , NO . HP . NZ . 75 > 

033*0 

09332000 


RNT rR34ND4N2B5 

*31** 

00233000 


HN4 - tin|4NC*N0»3 

0 3 351 

00334009 


CALL CLASSIFY' V0*r<nn3>.«0RK(HN2),VQRKCnR4>.KR.NC.NZ.N0, 

0335 1 

90353000 


* wcPr < PR5 ' , NFCLU* . U (Cf . INP. IRGIN. I NGCLAF, RArCi I'S . roi'Nt .HAS* > 

03*5 1 

0023**09 


NP» » HC.44T 



»14«? 0«2tr0«0 H«irc( TNlN6.2a2>HFCLU( 

«}42( 222 POtHAT(' FIML NUIIBC2 OF CLUtTCRt •'.!]> 

»S42i 00:t«9«0 CALL PF I N TF( U I tt> I HO. I . TNI HeCP . II . 9 • « . 0 . • . • , 0. 0, 0 > 

«?49S 002900AO DO 19 I ■ I.NFCLUS 

924(2 99291999 IF ( COUN K I « I > . LC 9 > GO tO 19 

934P; 99292099 WP I It( TN IHG . 1 12 ) COUNT! I « I >. ( OORKI K< I *N0 > 

9?4r; 90293999 * .R • I . HO > 

92S;« 90294999 11 1 POP HAT! I P . 1 41 4 > 

03S*< 90299099 CALL PP IN TP( U t CO . I H9 . I . T HI HGC9. NPP . 9 . 9 . 9. 0 . 0 . 0 . 9 . 9 > 

0}3«2 90290090 19 CONTIHUC 

one 9929P099 IP (COUNKt ).C0.9> GO TO 449 

91;M 99290099 VR I IC( TN I NG . 444 > COUNT! 1 > 

«l«l( 90299999 444 POPflAT! ' THCRC AtC *.IP.' UNCLASSIFIED.') 

DISK 99300999 CALL PR INTP! U I CO . I ND . I . TNI HGEO. 32 . 9. 9. 9 . 9 . 9 , 9. 9. 0 > 

93C4* 99101990 449 IF < COUNT! 1 99 > . CD 9 ) CO TO 3SS 

93«;« 00102990 «R1 IE! THING. 999) COUNT! 199 ) 

eirco 99101090 999 FORNATCIOH THE NASK CONTAINS .17. OH POINTS.) 

0!7<( 90104090 CALL PP I HTP! U I CO . IND. I ■ T HI NCCO. II . 9. 0. 9 . 9 . 9 . 9. 9. 9 > 

o;r 2 r ooiosooo sss continue 

01727 00104009 I F< CLASHAP )CAL LHAPP! PR I N TSL . PRI HTHL . PR I NTSS . UI CD . IND. IHGCLAS. 

02727 09197900 4 NR.NX. SVNIOL > 

92747 09100099 C ALL CLOSCP! U I CO. (NO. INGCL AS . 0 ) 

02749 00109090 IF (INO!|).LT 9) CALL CHKI 0! 0 ICO. I ND . I NCCLAS . 0 , 0 . 0 . 400 ) 

94092 09310009 CALL CLOSER! U I CD . I ND. INC IN . 9 ) 

04012 09111000 IP ! INC! I >. LT 0) CALL CHK I 0! U IC 0. IND . I NGI H. 0 . 0 . 0 . 399 ) 

94015 09112900 C-- UPITE STAT FILE 

94C25 00111009 IF CSTATPILC) CALL AHSTATS ! P ILNUH . ND. HORN! HN2 ). COUNT . NFCLUS . U I C 0 ) 

94051 99114999 RETURN 

04952 00115000 END 


SYMBOL MAP

NAME      TYPE      STRUCTURE      ADDRESS      NAME      TYPE      STRUCTURE      ADDRESS

ACDRS 

INTEGER 

ARRAY 

0*217 .1 

AHOERA 


SURROUTINE 


AHSTATS 


SUOROUTINC 


ASELCCT 


SUDROUTINC 


ATHHCS 


SURROUTINE 


DLKSIZE 

INTEGER 

ARRAY 

0*X21 . 

CHON IHHP 

logical 

SINPLE VAR 

D*2I 11 

CHKIO 


SUDROUTINC 


CHNAP 

INTCCCR 

SINPLE VAR 

D*XS .1 

CLASHAP 

LOGICAL 

SINPLC VAR 

0PXI07 

CLO« SIPV 


SUOROUTINC 


CLNAP 

INTEGER 

SINPLE VAR 

D*XU . 

CLOltP 


SUOROUTINC 


CODES 

INTEGER 

ARRAY 

0*X2I . 

COUNT 

INICGER«4 

ARRAY 

D*X20 .1 

DCLUDS 


SURROUTINE 


DLAR 

INTCCERM 

SINPLE VAR 

D*XI29 

E9CHNAF 

CHARAC TER 

SINPLC VAR 

0*X4 

CCCLHAP 

character 

SINPLE VAR 

0*XIZ .1 

COLHAP 

CHARACTER 

SINPLC VAR 

D«XIO . 

CtHAlf, 

CNARACrCR 

SINPLE VAR 

0TXI4 .1 

ECHAN 

CNAPACTER 

SINPLC VAR 

D*XI4 . 

CAP 

INTEGER 

SINPLE VAR 

D4XI22 

FCNCCR 


SURROUTINE 


P ILtNO 

INTEGER 

SINPLE VAR 

0PXI94 

F ILHUH 

INTEGER 

SINPLE VAR 

9*XI04 

POPEH 

INTEGER 

PUNC TION 


1 

INTEGER 

SINPLE VAR 

D*X24 

IP 

INTEGER 

SINPLE VAR 

D*X27 

lADDRESS 

INTEGER 

FUNCTION 


ID 

INTEGER 

SIHPLC VAR 

0*X19 

MASK 

INTEGER 

SINPLC VAR 

D*XI1 1 

iHcriAs 

INTEGER 

SINPLE VAR 

0*X1I 

I HP. 1 N 

INTEGER 

SINPLE VAR 

0*X73 

I ht 

INTCCCR 

ARRAY 

D*X22 I 

IHTYPE 

INTEGER 

SINPLE VAR 

0*X 71 

IPCBAP 

IHTCGCT 

A P R A , 

9«XZ4 .1 

IP 

integer 

SINPLE VAR 

D»X47 

4 

IHTCCC* 

sinPLt van 

B ♦ 1 .* 0 

r 

I HTEGEP 

SINPLE VAR 

0»X 1 1 7 

LA 

LOCI'; -L 

s rnPL* VrfR 

r*L2r 

1. ABP 

INTEGER 

SINPLE YAP 

D»X52 









LACLHAP 

locical 

SIHPLE VAR 

B«R3S 

LB 

logical 

SIHRLE VAR 

• *X10 

LHAP 

INTEGER 

STAPLE VAR 

B*tr .1 

LUORR 

LOGICAL 

ARRAY 

84313 

RAFP 


SUtROUTINE 


NASK 

logical 

SIHPLE VAR 

B43S0 

HAKCLN 

IHTECCR 

SINPLE VAR 

«*XS1 

HAXCLUS 

INTEGER 

SIHPLE VAR 

B*XTS 

HIHCLtt 

INTEGER 

SINFLE VAR 

B4XS2 

HNI 

INTEGER 

SIHPLE VAR 

84332 

A*l» 

INTEGER 

SINPLE VAR 

a«xi»3 

NN2 

INTEGER 

SINPLE VAR 

B4X33 

NR] 

INTEGER 

SIHPLE VAR 

B4X34 

HN4 

INTEGER 

SIHPLE VAR 

B4339 

NR3 

INTEGER 

SINPLE VAR 

B4X3T 

HNS 

INTEGER 

SINPLE VAR 

04X41 

Rrr 

INTEGER 

SINPLE VAR 

B4X44 

NNB 

INTEGER 

SIHPLE VAR 

• 4X43 

NR] 

integer 

SINPLE VAR 

B*X94 

NOREBUES 


subroutine 


RSOPT 


SURROUTINE 


Ht40 

INTEGER 

SINPLE VAR 

• ♦X37 

N23 

INTEGER 

SIHPLE VAR 

a*XT9 

N288 

INTEGER 

SINPLE VAR 

84X34 

H]|f 

INTEGER 

SIHPLE VAR 

S*XS3 

H42B 

INTEGER 

SIHPLE VAR 

•♦X102 

M42f T9 

INTEGER 

SIHPLE VAR 

BiXI is 

HSO 

INTEGER 

SIHPLE VAR 

• 4X40 

NARES 

INTEGER 

ARRAY 

«*XtS ,I 

NBR 

INTEGER 

SIHPLE VAR 

84X47 

HCW 

INTEGER 

SINPLE VAR 

B4XSI 

HC 

INTEGER 

SIHPLE VAR 

8*X4S 

HC6 

INTEGER 

SINPLE VAR 

84X11* 

HCT 

INTEGER 

SIHPLE VAR 

84X112 

HC 

INTEGER 

SINPLE VAR 

B4X9I 

NFCLUS 

INTEGER 

SIHPLE VAR 

84X120 

NFL 

INTEGER 

SIHPLE VAR 

B4X12I 

NIOS 

INTEGER 

SIHPLE VAR 

• *X42 

NL 

INTEGER 

SINFLE VAR 

R4XP2 

NODS 

I HI EGER 

SIHPLE VAR 

84X43 

MFAF rt 

INTEGER 

SINFLE VAR 

84X53 

HFF 

INTEGER 

SINPLE VAR 

• 4X74 

NF 

integer 

SINPLE VAR 

84X100 

NRS 

INTEGER 

SIHPLE VAR 

• *XSS 

HS 

INTEGER 

SINPLE VAR 

84X101 

NTS 

INTEGER 

SINPLE VAR 

• 4X9S 

NUfCLU 


SUBROUTINE 


N2 

INTEGER 

SINPLE VAR 

•4X111 

OPERFl 


SUBROUTINE 


OPENRO 


subroutine 


OCTT\FE 

integer 

SIHPLE VAR 

84X114 

PAIORT 


subroutine 


PAFAR9 


SUBROUTINE 


PRINTNL 

INTEGER 

SINPLE VAR 

84XS9 

RF IHTF 


SUBROUTINE 


PRINTSL 

INTEGER 

SINPLE VAR 

•*XI89 

FF tRt9I 

INTEGER 

SINPLE VAR 

84X1 13 

READR 


SUBROUT INE 


AECT IZC 

INTEGER 

SIHPLE VAR 

84X44 

SETSYN 


SUBROUTINE 


SHARE 

CHARACTER 

SIHPLE VAR 

8432 .1 

SNANEB 

integer 

SINPLE VAR 

• 4X1 

SCFT 


subroutine 


S8RT 

REAL 

FUNCTION 


STAF T 


subroutine 


STATMLE 

LOGICAL 

SINPLE VAR 

• 4X77 

STHCOL 

CNARACTER 

AFRAV 

84323 ,1 

THING 

character 

SINPLE VAR 

• *X4 

THIHOCG 

INTEGER 

SINFLE VAR 

8433 ,1 

THINTSTH 


subroutine 


THFFHO 


SUBROUTINE 


UICB 

integer 

ARRAY 

Q-X3 

FCFr 

INTEGER 

ARRAY 

84 31 3 .1 

X 

REAL 

SIHPLE VAR 

• 4X123 

ZERO 

INTEGER 

ARRAY 

8-34 ,1 






PROGRAM UNIT AMOEBA COMPILED
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3. DETAILED DOCUMENTATION OF SUBROUTINES 


AMSTATS

Parent: MAIN

AMSTATS(FILNUM,ND,MEAN,COUNT,NFCLUS,UICB)

After the classification step, subroutine AMSTATS writes the means of 
the clusters to a disk file using the standard IDIMS format for 
statistics files. 


Method: AMSTATS receives the means and counts as parameters. It then
writes the means and counts to a previously opened disk file, one
mean vector per record. Vectors whose count equals zero are not written.
The file is written and closed using HP standard intrinsics. Since this
statistics file does not contain a covariance matrix, it cannot be used
in maximum likelihood classification or ellipse plotting.
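
The selection logic described above can be sketched in Python as follows. This is only an illustration: the function name `stat_records` and the tuple layout are hypothetical, the IDIMS record format is not reproduced, and saturating the count at 32767 is an assumption of the sketch (the documentation only notes that the 16-bit count field can overflow). Written classes are renumbered consecutively, which follows the class number counter J in the variable list.

```python
INT16_MAX = 32767  # largest value a signed 16-bit count field can hold


def stat_records(means, counts):
    """Build one (class_number, count_field, mean_vector) record per cluster.

    counts[0] is the unclassified population and is never written;
    counts[i] for i >= 1 is the population of the cluster whose mean
    vector is means[i - 1]. Clusters with a zero count are skipped.
    """
    records = []
    class_number = 1  # consecutive number given to each written class
    for i in range(1, len(counts)):
        if counts[i] == 0:
            continue  # empty clusters are not written
        # Assumption of this sketch: saturate instead of overflowing.
        count_field = min(counts[i], INT16_MAX)
        # Means are stored as reals in the statistics record.
        mean = [float(v) for v in means[i - 1]]
        records.append((class_number, count_field, mean))
        class_number += 1
    return records


if __name__ == "__main__":
    for record in stat_records([[1, 2], [3, 4], [5, 6]], [7, 10, 0, 40000]):
        print(record)
```

In this example the second cluster is empty, so only two records are produced, and the third cluster is written as class 2.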


Program Variables

BUFFER      INTEGER ARRAY  I/O array
CLASS       CHARACTER  Creator ID for statistics file
CONTROL     LOGICAL  Carriage control bit mask for FWRITE
COUNT       DOUBLE INTEGER  Populations of classes
FCLOSE      INTRINSIC  To close files
FILNUM      INTEGER  File number
FWRITE      INTRINSIC  To write a record
I           INTEGER  Index for number of classes loop
IM          INTEGER  I minus 1
J           INTEGER  Class number counter
K           INTEGER  Index for number of dimensions loop
LBUFF       LOGICAL ARRAY  Equivalenced to I/O array because FWRITE
            requires a logical array
MEAN        INTEGER ARRAY  Array of mean vectors
ND          INTEGER  Number of dimensions
NFCLUS      INTEGER  Number of final clusters
NFCP        INTEGER  Number of final clusters plus one
NUMPTS      INTEGER  Field that must hold COUNT, overflow possible
PABORT      SYSTEM SUBROUTINE
RBUFF       REAL ARRAY  Equivalenced to I/O buffer to allow stuffing
            of means in mandatory real format.
RESIZE      INTEGER  Record size
UICB        INTEGER ARRAY  User Information Control Block

r«(C f«ii NcwicTT-rncirave laieis »i «i rotTiMi/io»« rat. oct ii. imi. vtir «a 


••«!« tCOaiKOl «tCNINT>MOtt«SCC 

••*1* »«*a4«»o r-- INIS tuatouTiNf nriics the STantTict rui rt tni 

••ft* o«4*9«e» c- - conEicMiN or tme rRO(it«n 

o«4*t4«9 EWiRnviiNe *nsTRtE<riLnun.N»,nt«N.covat.iircLUS.aica) 

•«4I« 9««fr»99 RC«L Riurrti) 

9992* 994«a999 STtlCR IRT«IR«IC P«RI Tt . FClO*E 

99914 99499999 IHTCCtR*2 F ll NOR . RC*a( R* . NFCiaf >. ttCS I XI . 

99924 09999999 * tUFF CRt 229 ) . NRRF Tl, RICRt I I 

99924 O010I099 IHf|SER*9 C0VNTr|09> 

99929 99)92999 lOdCRl C9R IROL . LRUFF ( 2 29 ) 

99924 99992999 CNRR9CTCR49 CL99S 

99924 99)99999 f9VIVRLINCf ( RVFFIR .CLRtt > . < tUFFIRf 9 >. NRRFr « >, 

99924 09)9)999 * r ■UFFtf'' 1 4 1 > ■ RtUFF > . C19UFF . RUFFE R > 

99924 99)99999 RCttUC • I4»«2*H9 

99929 0999F099 RMFFER> F > • 9 

99931 99)99999 RFIF ■ RFCLUS * I 

99921 99)99999 CLRSt • ‘«R9lt« 

999)9 99)10999 •UFrtR(9> ■ RD 

90052 99)11999 CONTROL • FRLtf. 

999)9 99)12999 2 ■ I 

999)2 99)12999 C CLAD L COURT 1 ARE UHClR))IFIEO FIRED 

999)2 99)14999 C 

999)2 99)1)999 00 09) I ■ 2.RFCF 

99994 99)19999 IF < COUR T( I ) . E 0 9 > CO TO 09) 

99921 9>>)I2909 C 

9902* 99)10999 C -- RURFT) CAR OVERFIOR OUT TNE CRURRY STRTFILE 

990/1 00)1)999 C •- NR) ORLY RN IRTCGER*2 FIELO RVRILROLt 

90922 09)29999 C 

9992 : 09)21999 RURFT) • COURT(I) 

99191 99)23099 OUFFEROI • J 

90199 09)3)909 1 • 1 * 1 

99192 90)24999 IN • I - I 

90111 00)1)999 00 ))l K ■ I , RD 

9CII2 99)29990 ROUFFO) « RfRN<R,IN> 

OOITC 09)22990 ))l CONTINUE 

oein 90)19090 CALL FURMKFILNUR.LtUFF.RCOIlE.CONTROl) 

OCIir 0o;i)999 IF ( CC > 092.093.001 

9CI4I 90319999 091 CRll F ROORT ( 4 1 CO . 4 3 . 9 > 

9CI5I 90))I099 09) tONTINUC 

99151 90)12999 CRLL F CLO)E ( F I LNUR . i . 9 > 

9CI59 00)11099 IF < CC ' •OT.OOf.OOT 

991(9 993)4999 092 CRLL FAOORT (Ul(l.4).0> 

99129 99)1)999 999 CONTINUE 

99129 09)19099 RETURN 

90121 09)12099 (NO 


SYMBOL MAP

NAME      TYPE      STRUCTURE      ADDRESS      NAME      TYPE      STRUCTURE      ADDRESS

RFITRII 


SUIFOVTINC 


OUFFER 

INTEGER 

ARRAY 

0*1) 

CLAf S 

CNARACTE* 

iinrcc VriF 

0 • X4 .1 

CONTROL 

LOGICAL 

tlNFlE VAR 

0*11) 

COUNT 

IRTECEFX 

ARRAY 

R- t( .1 

FtlOIE 


(OOROUTINE 


FILNUR 

IRTECER 

IIN2LE VAR 

R-lll ,1 

FORITE 


(OOROOT IRE 


1 

INTEGER 

)IRFL( VAR 

(♦XT 

IN 

IRTECER 

(IRFLE VAR 

0*119 

1 

IHtCCIR 

tIRFLf VAR 

I*tl2 

R 

IHTtGIR 

lINFLl VAR 

0*114 

LOUFF 

LOGICAL 

ARRAY 

(•XI .1 

RIRH 

IRTICfR 

ARRAY 

0*12 

RO 

INTEGER 

(IRFLE VAR 

0-RI9 .1 

NFCLO) 

IHTtGIR 

lIRFLG VM 

0-1) . 

RFCF 

INTEGER 

(IRFLE VAR 

0*113 

NURFTI 

INTEGER 

IIRFlE VAR 

0*1) 

FRieRT 


fUOFOUMNC 


ROVFF 

REAL 

ARRAY 

9*14 

REOIEE 

IRTEGEF 

(INFLE V4R 

9*111 

UlCO 

INTEGER 

ARRAY 

9-14 


PROGRAM UNIT AMSTATS COMPILED
ASELECT

Parent: MAIN

Calls: CLOSEC

ASELECT(DAT,LAB,KNT,NL,NS,NR,NC,NZ,ND,NTS,DATA,LABEL,FILENO,UICB,
        IND,IMGIN,IMGCLAS)

Subroutine ASELECT takes a label map created by START and extracts test
sets. Both the label map and the test sets reside in temporary disk files.
The test sets are passed to THINTSTM.

Method: Using Wide Image Logic (see above), ASELECT segments the image
into strips. Each strip is about 6666/(ND+1) elements wide. Within a
strip, the label map is scanned for samples having the same label. The
data values are collected as encountered in a buffer DAT(NL,ND,NS). One
scan line of data and labels, requiring NC*(ND+1) words of memory, is
resident. For each active label, two further buffers are required: KNT,
counting how many samples have been found, and LAB, recording which label
occupies the slot. Thus NL*ND*NS + 2*NL + NC*(ND+1) < 21000 is required.
If NL <= 120 is estimated, we have NL*NS < (20760 - NC*(ND+1))/ND. We set
NL equal to the square root of the right-hand side, and NS as large as
possible while satisfying the first inequality. This memory allocation is
performed in the main program. For example, suppose we are processing an
image NZ = 2048 elements wide. For ND = 2, 4, 8, 12, and 16, we tabulate
NC, NL, and NS in Table 1. Even in the worst case, sufficient buffer space
is available to collect 31 samples. Note that NC/NL is relatively constant,
as is desirable. NC/NL is about 56.2*SQRT(ND)/(ND+1), and SQRT(ND)/(ND+1)
varies slowly with ND, e.g. as ND goes from 4 to 16, NC/NL should vary from
11.6 to 13.2 (these estimates are for very large NZ).
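
The allocation rule can be checked numerically. The Python sketch below is an illustration rather than the main-program code: the function name `allocate` is hypothetical, the strip widths NC are taken directly from Table 1 instead of being derived, and truncation to an integer is assumed for both the square root and the quotient. Under those assumptions it reproduces the NL and NS columns of Table 1.

```python
import math

WORDS_AVAILABLE = 21000  # total buffer words, from the inequality in the text
LABEL_OVERHEAD = 240     # 2*NL, with NL estimated at no more than 120


def allocate(nc, nd):
    """Return (NL, NS) for a strip NC elements wide with ND dimensions."""
    line_words = nc * (nd + 1)  # one resident scan line of data and labels
    # Right-hand side of NL*NS < (20760 - NC*(ND+1))/ND.
    x = (WORDS_AVAILABLE - LABEL_OVERHEAD - line_words) / nd
    nl = int(math.sqrt(x))      # truncate, an assumption of this sketch
    # Largest NS that the remaining words allow once NL is fixed.
    ns = (WORDS_AVAILABLE - 2 * nl - line_words) // (nd * nl)
    return nl, ns


# Strip widths NC for each ND, copied from Table 1 (NZ = 2048).
TABLE_1_NC = {2: 2048, 4: 1025, 8: 683, 12: 513, 16: 342}

if __name__ == "__main__":
    for nd, nc in TABLE_1_NC.items():
        nl, ns = allocate(nc, nd)
        print(nd, nc, nl, ns)
```

For ND = 2 and NC = 2048, for example, the right-hand side is (20760 - 6144)/2 = 7308, so NL = 85, and NS = (21000 - 170 - 6144)/(2*85) truncates to 86.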

Subroutine CLOSEC is called when:

    (a) the number of elements KNT(J) in buffer J for a particular
        label equals NS; or,

    (b) a new label is encountered and no slots are available to
        stash data having that label; or,

    (c) in a new line, an old label is not found; or,

    (d) a line with no labels whatever is found; or,

    (e) all lines have been processed.

CLOSEC performs the following functions:

    (1) It closes buffer J by setting LAB(J) = Z (Z = -32768 marks no
        label).

    (2) It shows another slot available by decrementing NA, the slot
        available pointer. (No slots are available if NA = NL.)

    (3) If KNT(J) is at least 5, it selects five test pixels as
        spread out as possible and writes them on disk; a count is
        kept of this event, called NTS in ASELECT.

    (4) It sets KNT(J) = 0.

In case (a), that slot is made available. The action taken in (b) is to
seek the eldest active label, close that one, and then begin the new label
there. In case (c), each such buffer is closed (since this label will no
longer be encountered). In (d) and (e), all active labels are closed.
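
The four functions of CLOSEC can be sketched in Python as follows. The names `closec`, `spread_indices`, `state` and `test_sets` are hypothetical, the in-memory list stands in for the disk file, and evenly spaced indices are only one plausible reading of "as spread out as possible"; the documentation does not say how the five pixels are chosen.

```python
Z = -32768  # marks "no label", the value given in the text


def spread_indices(n, k=5):
    """Return k distinct, evenly spaced indices into n samples (n >= k).

    The first and last samples are always included.
    """
    return [(i * (n - 1)) // (k - 1) for i in range(k)]


def closec(j, lab, knt, dat, state, test_sets):
    """Close slot j, saving a five-pixel test set if enough samples exist.

    lab[j]  label held by slot j (Z if free)
    knt[j]  number of samples collected so far in slot j
    dat[j]  the samples themselves, one vector per sample
    state   dict with 'na' (slots in use) and 'nts' (test sets written)
    """
    if knt[j] >= 5:
        picks = spread_indices(knt[j])
        test_sets.append([dat[j][p] for p in picks])  # stands in for the disk write
        state["nts"] += 1
    lab[j] = Z        # slot no longer holds a label
    knt[j] = 0        # slot is empty again
    state["na"] -= 1  # one more slot is available


if __name__ == "__main__":
    lab, knt = [7, Z], [9, 0]
    dat = [[[s] for s in range(9)], []]
    state, test_sets = {"na": 1, "nts": 0}, []
    closec(0, lab, knt, dat, state, test_sets)
    print(test_sets, lab, knt, state)
```

With nine samples in the slot, the sketch keeps samples 0, 2, 4, 6 and 8. A slot holding fewer than five samples is simply freed without producing a test set.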

In the Wide Image Logic, a boundary is generated when a new strip
is started. This prevents the bottom labels of one strip from being
joined to the top of the next, and also frees all buffers for a new
start.
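
The strip bookkeeping can be sketched in Python as follows. The function name `strips` is hypothetical; the roles assumed for NW (first element of the strip), NY (last element) and NX (actual number of elements, with NC as the target width) are read from the comments in the ASELECT listing, which is badly degraded, so they should be treated as an interpretation.

```python
def strips(nz, nc):
    """Split a scan line of NZ elements into strips at most NC wide.

    Returns a list of (NW, NY, NX) tuples using 1-based element numbers:
    NW is the first element of the strip, NY the last, NX the width.
    """
    out = []
    for nw in range(1, nz + 1, nc):
        ny = min(nw + nc - 1, nz)  # the last strip may be narrower than NC
        out.append((nw, ny, ny - nw + 1))
    return out


if __name__ == "__main__":
    for strip in strips(2048, 683):
        print(strip)
```

For NZ = 2048 and NC = 683 this gives three strips, the last one 682 elements wide. Each new strip is where the boundary described above would be generated.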

Program Variables

CHKIO            SYSTEM SUBROUTINE
CLOSEC           SUBROUTINE  Writes test sets on disk after sampling,
                 and frees buffer.
DAT(NL,ND,NS)    INTEGER ARRAY  Used for accumulating patches by
                 label (the first variable), dimension (second) and
                 by count.
DATA(NC,ND)      INTEGER ARRAY  One line of data.
FILENO           INTEGER  The file of test pixels, opened and written
                 by CLOSEC.
I,IREAD,J,JS,K   INTEGER  DO loop index.
IMGCLAS          INTEGER  Label map file number.
IMGIN            INTEGER  Data file number.
IND(1)           INTEGER ARRAY  Error indicator.
KN               INTEGER  Count number.
KNT(NL)          INTEGER ARRAY  Running count of number of each label
                 found.
LAB(NL)          INTEGER ARRAY  Label of particular slot.
                 Note: For each J, KNT(J) is the number found so far
                 with label LAB(J); these samples are stored in
                 DAT(J,.,1) through DAT(J,.,KNT(J)).
LABEL(NC)        INTEGER ARRAY  A line of labels.
LOLD             INTEGER  Used in finding oldest active label.
NA               INTEGER  Used to indicate when a search for an available
                 slot should be undertaken. When NA = NL, no slots are
                 available.
NC,NW,NX,NY,NZ   INTEGER  Used in Wide Image Logic.
ND               INTEGER  Dimensionality.
ND5              INTEGER  ND*5; used as a dimension parameter for CLOSEC.
NL               INTEGER  Number of labels collected at once.
NR               INTEGER  Number of lines.
NS               INTEGER  Number of samples for each label.
NTS              INTEGER  Number of test sets written.
READP            SYSTEM SUBROUTINE
TSP(80)          LOGICAL ARRAY  Buffer for writing test sets to disk in
                 CLOSEC.
UICB(1)          INTEGER ARRAY  User Information Control Block
Z                INTEGER  Boundary marker: -32768.


Table 1. Example of Memory Allocation for ASELECT

Number of Samples = 2048

      ND        NC        NL        NS

       2      2048        85        86
       4      1025        62        63
       8       683        42        43
      12       513        34        34
      16       342        30        31

r*U Ht«L(TT-»«Ck**0 1}|«II *1 FOKTRM/SO* tSI. OCt II. IHI. fill «R 


RRMItRO «UIROliTtNC ASrilC M PM . I. M.KNT. Ml . HI . MR. MC. MI. «• . HTt. (.Mil . 

IMS* e«MIMO • MLCHQ.MCI. IMP. tnilH. IRIllMI) 

• OPIP PPPPIPPP IRTIIIR*! RRrtNl. HP. Rt>. IRKHLI. RHUNl >. MIMHC.hr>. LMKKRO. 

POPIP PPRPPPPP » I.PIl{MO.«ICMI>-IMMI> 

PPPIP PPMIPPP IRIICPl CRHFtOl.Ttf(M> 

PPPIP PPRPtPPP t 

PPPIP ppRpfppp e 

PPPIP PPPPIPPP C PMPRCTIIIi 

PPPIP PPRPPPPP C 

PPPIP PPP3PPPP C TIP •• pump FIR PCLOtCC 

PPPIP PPPPIPPP C MT -- PMtp PCIN6 IPVIP 

PPPIP PPPlIPpP C IM -- IMRtl VCCIPR 

PPPIP PPPlIPPP C KMT -- CPPMT VtCTPR 

PPPIP PPPPIPPP C HNI(I) • P MIAMI ILP1 I II FRCI 

PPPIP PPPPIPPP C Ml •• RAK MUHPIR IF LAPCLI 

PPPIP PPPlIPPP C Ml •• RAK lAMFlII FIR PATCH 

PPPIP PPPPIPPP C HP -- PIHIHIIINAIITT 

PPPIP PpPIPPPP C NC •• IIIMINTI FIR ICAR L IHI 

PPPIP PPPPIPPP C Ml -• PCfttAi NPHPIR PF ILIRRMTI MR PCAH LIRI 

PPPIP PPRPPPPP C HP •• ITARTIMI IIIHINT MAP 

PPPIP PPMIPPP C MT lAIT lllHinT MAP 

PPPIP PPMIPPP C MR •- MMRMI MAP (NC IP TARIIT. PUT HR IP ACTUAl RUHMR> 

PPPIP PPMIPPP C Ml -- HMflRIM IF PCAM IIMII 

PPPIP PPMPPPP C MR -- RURtIR IF LAPIl liPTI PI IHI PIIP 

PPPIP PPMIPPP C MTI •- IMHMIMI ttPMT PF RPNPIR PF TUT IRTI 

PPPIP PPMIPPP C 

PPPIP ppMTPpp t iNtriAiiit 

PPPIP PPMPPPP MPP • HP*3 

pppii PPMPPPP I > -iirM 

PPPII PPPFPPPP COHTROl • TRPI 

PPPIP ppirippp RM • p 

PPPII PPiriPpp NP • p 

PPPIP pPirippp PA IP I • i.Hi 

PPP43 PPITPPPP KNT(|> > P 

PPPIP PPiriPPP IP IARU> • I 

PPPIP PPITPPPP C 

PPPIP pcirrppp t pRoccpt it itrifi amut rc uipi 

PPPIP PMTPPPP IP PI RV ■ I.RI.RC 

PPPII PPITPPPP MT • NUPRC-I 

PPPIP PPPPIPPP IF (HT If Ht> RT « Ml 

PPPTI PPMIPPP RK • HT-RU«I 

PPPTI PPPPIPPP C 

PPPTI PPPPIPPP C FRItlll IT ICAR IIRII 
PPPTI PPIRIPPP PQ IIP IICAD • I, HR 

Mill PPMIPPP t 

PPtPI PPIRIPPP C Ml IF ITAIT IF A RIP ITRIF 

PPIPl PPRPTPPP IF (IRIRP It I PR HU CP t) (P TO PT 

PPIIP PPPIPPPP PO PI I • l-HK 

PC II I PPPIPPPP fp IAPCI(I> ■ I 

PPII1 PPITPPPP 40 TO TT 

Mill PPPPIPPP C 

Mill PPI1IPPP C RCP» lAPIll 

Mill MITIMP pr CPU RIPPF( UlCR. (HP, IRIClPl.lRPli I . I . I Rf PR . MR . HR . 

PPIII PPPIPPPP * I. lltPMI.HP.NK> 

PPII1 PPPPIPPP IF (IHPtI) If P) CPU CRKIMUitP. IHP. IRCCIPI.IRIRP.hr. HI. 9>9> 

M|F| PPPPIPPP C 
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r«U ««2« MCUCT 


Mirs 


c 

set ir Tttae «m • L«tei this line. 

••irs 

•••«•••• 


•3 •• so I ■ I.MX 

•«2«I 

•4t03000 


ir tLOtCL(l>.HC.Z> M TO 99 

•«lll 

•••••••O 


90 eONTtNOC 

••iia 

•OfOIOOO 

c 


••tta 

••foaooo 

c 

ttt tr MT oerm. 

•«aia 

••••S044 



••air 

••••4404 

c 


•vair 

•0303900 

c 

CLOSC *U •CTt9C LMSCLS. 

••air 

•••••••O 


90 40 4 ■ 1. HI 

••aa« 

••••rooo 


tr <L«t<4 >.ni.2> CALL CL0BIC«Z.4>Ttr.L«I.HNT.0AT.HL. 

••aa« 

•••••••• 


• Rt.HO.HA.HTS.FtLtHO.alCS. INS.CSMTRtL.HDS > 

••asr 

•0309444 


to CSHriH«C 

••a<^ 

•••lOOOO 


40 TO 300 

•«a<i 

0091 !••• 

c 


••aci 

009I3000 

c 

CNCCk ro« iNHcmc olo looei. 

••act 

•09I3004 


99 00 !•• 4 ■ I.Rl 

••a«« 

•09I4000 


L • IA0<4) 

••an 

•0913000 


IF (L.CO.ZI (0 TO 100 

•oars 

•09I4000 


00 no 1 ■ i.Nx 

•osca 

00917000 


IF (LHOELI n.tO.L) CO TO 190 

• 01I9 

0091 •••• 


no COHTIHUE 

••III 

0091 9000 


COLL CL0teC(Z.4.TSF.L*0.KHT,Mr.HL.Ht.H».n«.HTt. 

••an 



0 FILtH0.UICt.IR0.C0HT30L.H09) 

• OJ«« 

00931000 


too CONTIRVC 

• OI«l 

•0923000 

c 


•oa«i 

00933000 

c 

FOIHT TO START 

• 0141 

00924000 


4 • 1 

• 0341 

00933000 

c 


• 0343 

00934000 

c 

RtAO OATA 

• •343 

00937000 


•0 40r K ■ I.Nt 

• 03SO 

oooatooo 


CALL tCADPt UlCt. IHO. IHCIH.OATAC 1 . K >. 2. K . t RtAO. HH . HX . 

• 0330 

00939000 


• R«t.lReA».H«.HX> 

• 04»0 

00910000 


IF C IRO< I ). LT.O) CALL CHKI OC « ICO. IHO. IRCIR. R. IRt AO. RC . 330 ) 

•o4ao 

00931000 


•or CONTIHUC 

• 0431 

00933000 

c 

FROCCSS CURRCHT SCAR LIHC 

• 0431 

00933000 

c 


• 0431 

009I4000 


00 300 I > t.HX 

00434 

00933000 


L ■ LARtLtt > 

• 0431 



IF (L CO.Z) 40 TO 200 

• 0414 

00917000 

c 


• •434 

009S4000 

c 

LAtCL FOUHO LOOK FOR A DUFC 

•0414 



IF (LAR(4 ).CO.L> CO TO 210 

• 0443 

00940000 


00 220 4 ■ l.HL 

• 0439 

0094I004 


IF <LA0C4 > CO L ) 40 TO 210 

• 0434 

00942000 


230 COHTIHUC 

•04sr 

00943000 

c 


•043r 


c 

FILL TNROUCH -• NO HATCH FOUHO 

•043r 

00943000 

c 

CNCCK IF ThCRC IS ROOn 

• 043r 

00944000 


IF (HALT NL> CO TO 300 

• 0^43 

00947000 

c 


• 0443 

•094t000 

c 

HO ROON. CLOtC THC ClDEST 

• 0443 

00949000 


LOLO • LAOt 1 > 

40444 

00930000 


4 < I 

• •474 

00931000 


00 310 44 ■ 2,HL 

• 0473 

00932000 


tr (<.A0(4S> CC LOLO> CO TO 310 

• 0302 

00933000 


4 • 4S 
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M«*« Mt 9 «M« L«L» ■ LAKJS) 

Ml*r MM 9 M* II* CMTIMC 

•«ll* **Mt*** C 

**ii* ** 99 r*«* c 9 i»C 9 T IS rsiiirc* ?• *v * 

**3I* **ySM** C*iL ClOttCCt.*. ttf.LM.KNT.C>*r.HL.NS.R*.m.llTS. 

**ii* **fsy*** • riLns.sies.im.coMrML.RM) 

••sir »*ys**** so ri is* 

•* 9 «* ***«i*«* c 

>* 3 «* »**•>*** C CRtCH «P OH LtCICl tOM *«t RCIC MOC 

•* 9 «* MMI**« C 

** 9 «* ******** c rim * utr 

** 9 «* **** 9 *** 1 ** 00 12 * J ■ l.ML 

•• 9*9 ******** IF <L«tCi>.I 0 . 2 > CO to i'i* 

*•991 ***«r*** II* COMtlOOC 

**994 ******** II* Kortll • * 

•* 39 r ******** LM(J> - L 

** 9*2 ***r**** M* ■ R*«t 

** 9 *l ***n*** 21 * K« - RHr( 4 >*l 

** 3 ir ***r 2 **« KOTO) ■ KM 

** 9 F 2 ***n*** 00 «** K ■ I.NO 

***rr ***r**** «•* •*t<km.k.*> ■ mt*( 1 .k> 

•**ir ***F9*** c 

***ir ***r**** c CLOot ir km ■ ml 

***ir ***rro** IFCKM.M.RLI tM Ci. 0 *fC( 2 <l •tSr.LMO.KNf .••T.HL.NS.H*.N*.MtS. 

•**ir ***r0*** • riLKHO.OSCB. IR? t'J'URRL.ROSI 

*** 9 I **?'**** C 

**« 9 I *** 0 **** C COLOMM L* 0 r CRM 

*** 9 I ***•!*** 2 ** COMTIHMC 

**•92 *«* 02 *** C 9 CRM LOOK fRO 

*•*92 *«* 0 I*** 9 ** CORTtli.C 

•** 9 * ******** C 

*** 9 « *** 09 *** C STRir LOOM tRO 

*•* 9 * ******** ** COMTIHMC 

***•2 ***or*** c 

*«*I 2 »•*•*«•* C CLOSC CVCRTTHIH 6 IM SICHT 

**••2 ******** 00 *** 2 > t.RL 

***tr ******** IFCLR0(2>.HC 2)CRLL CLOSCCI 2. 2 . TSF .L*0. KRT .*RT . RL .HS . HO. HR. RT t 

•***r ****!*** • riLCnO.OIC*. tM0.C0HTR0L.RD9) 

*0222 *** 92 * 0 * **• COMTIRVfc 

**r 2 l *««*]*** R C T U R M 

**ri 4 ***« 4 *** CHO 


H*RC 

TYFt 

tTRUCTUtC 


•(tLCCT 


tUOROVTIRC 



CLOttC 


SROROOTIrC 



o*r' 

IHTCCCR 

• HRHY 


■ •224 

• I 

riLiHR 

IHTCCCR 

SIHFLC 

YHR 

0-210 

# 1 

IRSCLHt 

IRTCCCR 

SIHFLC 

VRR 

0-24 

, 1 

IR* 

IRtCCRR 

• RRHY 


0-2* 

« 1 

2 


SIHFLC 

VRR 

0*212 


R 

IRttORR 

SIHFLC 

YHR 

0*221 


RRt 

IRtCRRR 

hffht 


• -222 

F I 

LRO 

IRItCIR 

HRRHT 


0-221 

• 1 

L*L* 


SIHFLC 

YRR 

o*2ir 


RC 

IRtCCtR 

SIHFLC 

YHR 

• -2I* 

• I 

R09 

IHTCCCR 

S IHFLC 

YHR 

0*211 


HR 

IHTCCCR 

SIHFLC 

YHR 

0-217 

0 1 

RTS 

IHTCCCR 

SIHFLC 

YHR 

■ '211 

• 1 

RX 

IHTCCCR 

SIHFLC 

YHR 

■ *2I* 


HI 

IHTCCCR 

SIHFLC 

YHR 

0-219 

* I 

ttr 

LOCtCM. 

HfRHT 


0*29 

F 1 

I 

IHTCCCR 

SIHFLC 

YHR 

0*222 



HHHC 

TYFC 

STRUCTURC 

HOORCSS 

CHKIO 

COHTROL 

LOCICHL 

SOOROUTIHC 
SIHFLC YHR 

• *22I 


DHTH 

IHTCCCR 

HRRHY 

0-212 

. I 

I 

IHTCCCR 

SIHFLC VHR 

0*2* 


IHCIH 

IHTCCCR 

SIHFLC VHR 

0-29 

. I 

IREHO 

IHTCCCR 

SIHFLC VHR 

0*219 


4S 

IHTCCCR 

SIHFLC VHR 

0*211 


KH 

IHTCCCR 

SIHFLC VRR 

0*224 


1 

IHTCCCR 

SIHFLC YHR 

• *214 


LROCL 

IHTCCCR 

HRRHY 

0-211 

.1 

MR 

IHTCCCR 

■IHFLC VHR 

0*27 


RO 

IHTCCCR 

SIHFLC VHR 

0-214 

.1 

Ml 

IHTCCCR 

SIHFLC VHR 

0-221 

.1 

MS 

IHTCCCR 

SIHFLC VHR 

0-22* 

.1 

MR 

IHTCCCR 

SIHFlC VHR 

0*214 


HY 

IHTCCCR 

SIHFLC VHR 

0*22* 


RCHOF 

OICO 

IHTCCCR 

SOOROUTIHC 

HRRRY 

0-27 

.1 


PROGRAM UNIT ASELECT COMPILED


CLASSIFY

Parent: MAIN

Calls: PERPIXEL, MARKUP, FIXUP

CLASSIFY(PIXELS,CLUSTERS,LABELS,NR,NC,NZ,ND,REJECT,NFCLUS,UICB,IND,
         IMAGE,IMGCLAS,MAXCLUS,COUNT,MASK)

This subroutine performs a spatially supervised classification of multi- 
image data. The underlying spectral classifer is a nearest neighbor 
(Euclidean distance) per pixel classifer. Such a classifer behaves poorly 
on mixtures: a point on the spatial boundary between classes will some- 

times be classified in another actual class. In CLASSIFY, the mildest 
possible reclassification is performed: only points classified unlike 

all four of their neighbors are reclassified, and even these are left 
alone if the nearest class of a neighbor is too far away. 

Method: CLASSIFY uses Wide Image Logic, the Mask, Four Neighbors,
Circular Buffers, and Rejection Thresholds. These concepts are documented
separately. Assuming they are understood, the method can be described
briefly. In each (wide image) strip of data, a circular buffer of three
scan lines of data and labels is formed. Initially, all three lines are
classified (subroutine PERPIXEL). Then a big loop is entered (label 30),
and the center line, pointed to by I2, is marked (subroutine MARKUP) to
indicate pixels classified like at least one neighbor. Subroutine FIXUP
is entered to reclassify unmarked pixels like one of their solidly
classified neighbors. (These subroutines could, of course, be differently
restrictive.) Then the eldest label line, pointed to by I1, is written on
disk, the buffer is rotated, and a new line of data is read and classified.
When no more data can be read, the last two lines of labels are written,
and the next Wide Image Strip is processed.
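
The three-line circular buffer described in the method can be sketched as follows. This is an illustrative Python sketch, not the FORTRAN routine: the names run_strip and process_center are hypothetical, scan lines are plain lists, and at least three lines per strip are assumed.

```python
def run_strip(lines, process_center):
    """Pass scan lines through a three-slot circular buffer.

    process_center(eldest, center, newest) stands in for the MARKUP and
    FIXUP step; it may modify the center line in place.
    """
    source = iter(lines)
    buf = [next(source), next(source), next(source)]  # assumes >= 3 lines
    i1, i2, i3 = 0, 1, 2          # eldest, center, newest slots
    written = []
    while True:
        process_center(buf[i1], buf[i2], buf[i3])
        written.append(buf[i1])   # the eldest line is final: write it out
        try:
            buf[i1] = next(source)  # reuse the freed slot for the new line
        except StopIteration:
            break
        i1, i2, i3 = i2, i3, i1   # rotate the pointers, not the data
    written.append(buf[i2])       # no more data: flush the last two lines
    written.append(buf[i3])
    return written
```

Only the three pointers move on each pass, so no scan line is ever copied inside the buffer.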

A count is kept of the number of pixels in each class. This is returned
in vector COUNT, a long integer array. COUNT(1) is reserved for unclassified
elements (which appear on disk with label 0). COUNT(2) through COUNT(99)
are the numbers in classes 1 through 98. COUNT(100) is the number of points
in the Mask, given label 99 on disk.
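
The offset between disk labels and COUNT slots can be illustrated with a short Python sketch. The function name tally is hypothetical; because Python lists start at 0, count[k] here plays the role of COUNT(k+1).

```python
def tally(labels):
    """Count pixels per disk label; count[k] corresponds to COUNT(k+1)."""
    count = [0] * 100
    for label in labels:
        count[label] += 1  # 0 = unclassified, 1..98 = classes, 99 = mask
    return count
```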
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CLASSIFY-2 

The classification routine may introduce new classes; because the 
data is only scanned once, the new classes will not be attractors until 
they are formed. The reason for introducing new classes lies in the 
profound unpopularity of unclassified pixels, as well as the stubborn 
adherence to the mixture model. Since these classes are usually small, 
the percentage of errors is likely to be tiny. 
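
A single-pass classifier that opens new classes as it goes might look like the sketch below. It is only an illustration under stated assumptions: the name per_pixel, the comparison of squared distance against the rejection threshold, and the threshold new_reject given to a new class are all hypothetical and are not taken from PERPIXEL itself.

```python
def per_pixel(pixels, clusters, reject, max_clusters, new_reject):
    """Label each pixel 1..n by nearest center; 0 means unclassified."""
    labels = []
    for px in pixels:
        dists = [sum((a - b) ** 2 for a, b in zip(px, c)) for c in clusters]
        k = min(range(len(clusters)), key=dists.__getitem__)
        if dists[k] > reject[k]:
            if len(clusters) < max_clusters:
                clusters.append(list(px))   # the pixel seeds a new class
                reject.append(new_reject)
                k = len(clusters) - 1
            else:
                labels.append(0)            # no room left: unclassified
                continue
        labels.append(k + 1)
    return labels
```

Because the data are scanned once, a class created at one pixel can attract only the pixels that come after it, as the paragraph above notes.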

Program Variables

CHKIO                  SYSTEM SUBROUTINE

CLUSTERS(ND,MAXCLUS)   INTEGER ARRAY  The cluster centers or attractors.
                       Their actual number is NFCLUS, which may be changed
                       in PERPIXEL.

COUNT(100)             LONG INTEGER ARRAY  The count of the number in each
                       cluster. COUNT(1) is the number unclassified,
                       COUNT(100) is the number in the mask, and COUNT(I)
                       is the number in cluster I-1 for I = 2,...,99.

FIXUP                  SUBROUTINE  Processes points classified unlike each
                       of their four neighbors, attempting reclassification
                       according to the mixture model.

I                      INTEGER  DO loop index.

I1,I2,I3               INTEGER  Circular buffer pointers.

IMAGE                  INTEGER  Input image number.

IMGCLAS                INTEGER  Output image number.

IND(1)                 INTEGER ARRAY  Error indicator.

IREAD,IROW             INTEGER  Line numbers on read and write, managed in
                       each strip.

IT                     INTEGER  Scratch variable, used to rotate buffer.

JC                     INTEGER  Used to index into COUNT while counting
                       LABELS.

JJ,K                   INTEGER  DO loop index.
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CLASSIFY-3 


LABELS(NC,3)           INTEGER ARRAY  Circular buffer of classifications.

MARKUP                 SUBROUTINE  Adds 101 to the center label when that
                       label is like at least one of the four neighbors.

MASK                   LOGICAL  If .TRUE., a value of 0 in channel 1 of the
                       data is classified "mask" and labelled 99.

MAXCLUS                INTEGER  The maximum number of clusters allowed.

NC                     INTEGER  Width of each strip.

ND                     INTEGER  Dimensionality of data.

NFCLUS                 INTEGER  Dynamic number of clusters.

NR                     INTEGER  Number of lines.

NW,NX,NY,NZ            INTEGER  Used, in loop 96, to segment the image into
                       strips.

PERPIXEL               SUBROUTINE  Performs a per pixel nearest neighbor
                       classification. Introduces new clusters when the
                       nearest neighbor is too far away to fit the mixture
                       model.

PIXELS(NC,ND,3)        INTEGER ARRAY  One circular buffer of data in a strip.

READP                  SYSTEM SUBROUTINE

REJECT(MAXCLUS)        INTEGER ARRAY  The rejection thresholds.

UICB(1)                INTEGER ARRAY  User Information Control Block.

WRITEP                 SYSTEM SUBROUTINE

HEWLETT-PACKARD 32102B.01.03 FORTRAN/3000      TUE, OCT 13, 1981

$CONTROL SEGMENT=AMOEBASEG
      SUBROUTINE CLASSIFY(PIXELS,CLUSTERS,LABELS,NR,NC,NZ,ND,REJECT,
     * NFCLUS,UICB,IND,IMAGE,IMGCLAS,MAXCLUS,COUNT,MASK)
      INTEGER*2 UICB(1),IND(1),REJECT(MAXCLUS),CLUSTERS(ND,MAXCLUS),
     * PIXELS(NC,ND,3),LABELS(NC,3)
      INTEGER*4 COUNT(100)
      LOGICAL MASK
      DO 1 I = 1,100
    1 COUNT(I) = 0
C
C     WIDE IMAGE LOGIC
C     NW IS STARTING COL IN IMAGE
C     NY IS ENDING
C     NX IS ACTUAL NUMBER READ
C
      DO 96 NW = 1,NZ,NC
      NY = NW+NC-1
      IF (NY.GT.NZ) NY = NZ
      NX = NY-NW+1
C
C     SET READ/WRITE COUNTERS
C
      IREAD = 3
      IROW = 1
C
C     SET UP CIRCULAR BUFFER POINTER
C
      I1 = 1
      I2 = 2
      I3 = 3
C
C     GET STARTED: READ 3 SCAN LINES DATA
C
      DO 20 I = 1,3
      DO 10 K = 1,ND
      CALL READP(UICB,IND,IMAGE,PIXELS(1,K,I),2,K,I,NW,
     * NX,K+1,I,NW,NX)
      IF (IND(1).LT.0) CALL CHKIO(UICB,IND,IMAGE,I,K,NC,11)
   10 CONTINUE
C
C     CLASSIFY FIRST THREE SCAN LINES
C
   20 CALL PERPIXEL(PIXELS(1,1,I),CLUSTERS,LABELS(1,I),
     * ND,NC,NFCLUS,REJECT,MAXCLUS,MASK,NX)
C
C     REFERENCE FOR BIG LOOP
C
C     MARK PIXELS CLASSED LIKE THEIR NEIGHBORS
C
   30 CALL MARKUP(LABELS,NC,I1,I2,I3,NX)
C
C     USE CONTEXT TO ATTEMPT RECLASSIFICATION
C
      CALL FIXUP(LABELS,NC,I1,I2,I3,PIXELS(1,1,I2),REJECT,
     * CLUSTERS,ND,MAXCLUS,NX)
C
C     WRITE A SCAN LINE OF LABELS
C
      CALL WRITEP(UICB,IND,IMGCLAS,LABELS(1,I1),2,1,IROW,NW,NX,
     * 1,IROW+1,NW,NX)
      IF (IND(1).LT.0) CALL CHKIO(UICB,IND,IMGCLAS,IROW,0,0,21)
      DO 2 JJ = 1,NX
      JC = LABELS(JJ,I1)+1
    2 COUNT(JC) = COUNT(JC)+1
C
C     POINT TO NEXT SCAN LINE
C
      IROW = IROW+1
      IREAD = IREAD+1
      IF (IREAD.GT.NR) GO TO 1000
C
C     GRAB ANOTHER SCAN LINE
C
      DO 40 K = 1,ND
      CALL READP(UICB,IND,IMAGE,PIXELS(1,K,I1),2,K,IREAD,
     * NW,NX,K+1,IREAD,NW,NX)
      IF (IND(1).LT.0) CALL CHKIO(UICB,IND,IMAGE,K,NW,IREAD,243)
   40 CONTINUE
C
C     ROTATE CIRCULAR BUFFER
C
      IT = I1
      I1 = I2
      I2 = I3
      I3 = IT
C
C     CLASSIFY THE NEW CRITTER
C
      CALL PERPIXEL(PIXELS(1,1,I3),CLUSTERS,LABELS(1,I3),ND,NC,
     * NFCLUS,REJECT,MAXCLUS,MASK,NX)
      GO TO 30
C
C     FINISH UP
C
 1000 CONTINUE
      CALL WRITEP(UICB,IND,IMGCLAS,LABELS(1,I2),2,1,IROW,
     * NW,NX,1,IROW+1,NW,NX)
      IF (IND(1).LT.0) CALL CHKIO(UICB,IND,IMGCLAS,I2,IROW,NC,262)
      DO 3 JJ = 1,NX
      JC = LABELS(JJ,I2)+1
    3 COUNT(JC) = COUNT(JC)+1
      CALL WRITEP(UICB,IND,IMGCLAS,LABELS(1,I3),2,1,IROW+1,
     * NW,NX,1,IROW+2,NW,NX)
      IF (IND(1).LT.0) CALL CHKIO(UICB,IND,IMGCLAS,I3,IROW,NC,267)
      DO 4 JJ = 1,NX
      JC = LABELS(JJ,I3)+1
    4 COUNT(JC) = COUNT(JC)+1
   96 CONTINUE
      RETURN
      END


SYMBOL MAP

NAME        TYPE         STRUCTURE

CHKIO                    SUBROUTINE
CLASSIFY                 SUBROUTINE
CLUSTERS    INTEGER      ARRAY
COUNT       INTEGER*4    ARRAY
FIXUP                    SUBROUTINE
I           INTEGER      SIMPLE VAR
I1          INTEGER      SIMPLE VAR
I2          INTEGER      SIMPLE VAR
I3          INTEGER      SIMPLE VAR
IMAGE       INTEGER      SIMPLE VAR
IMGCLAS     INTEGER      SIMPLE VAR
IND         INTEGER      ARRAY
IREAD       INTEGER      SIMPLE VAR
IROW        INTEGER      SIMPLE VAR
IT          INTEGER      SIMPLE VAR
JC          INTEGER      SIMPLE VAR
JJ          INTEGER      SIMPLE VAR
K           INTEGER      SIMPLE VAR
LABELS      INTEGER      ARRAY
MARKUP                   SUBROUTINE
MASK        LOGICAL      SIMPLE VAR
MAXCLUS     INTEGER      SIMPLE VAR
NC          INTEGER      SIMPLE VAR
ND          INTEGER      SIMPLE VAR
NFCLUS      INTEGER      SIMPLE VAR
NR          INTEGER      SIMPLE VAR
NW          INTEGER      SIMPLE VAR
NX          INTEGER      SIMPLE VAR
NY          INTEGER      SIMPLE VAR
NZ          INTEGER      SIMPLE VAR
PERPIXEL                 SUBROUTINE
PIXELS      INTEGER      ARRAY
READP                    SUBROUTINE
REJECT      INTEGER      ARRAY
UICB        INTEGER      ARRAY
WRITEP                   SUBROUTINE

PROGRAM UNIT CLASSIFY COMPILED
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Parent: ASELECT 


CLOSEC 


CLOSEC(Z,J,TSP,LAB,KNT,DAT,NL,NS,ND,NA,N,FILENO,UICB,IND,CONTROL,ND5)


In this subroutine, a patch of pure pixels is closed by selecting a
test set from the patch (5 representative pixels) and writing the
test set to disk.


Method: If the file has not yet been opened (i.e., if CONTROL is true),
then it is opened. The file name is TSTPXLnn, where nn is a number
from 1 to 99, depending on how many test pixel files are currently
open in concurrently running sessions. The test pixel file is
job-temporary, that is, it is deallocated when it is closed.

If possible, five test pixels are selected from the patch by
choosing them at equally spaced intervals along the array. This test
set is written to a disk record. The test set count is incremented
and the count of occupied labels is decremented.
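
Choosing five pixels at equally spaced intervals can be sketched as below. This is a hedged Python illustration: the name pick_test_set and the stride formula (length minus one, divided by four, rounded down) are assumptions, since the text states only that the pixels are equally spaced.

```python
def pick_test_set(patch, size=5):
    """Return `size` pixels spread evenly over the patch, or None if too few."""
    if len(patch) < size:
        return None                      # patch too small to give a test set
    step = (len(patch) - 1) // (size - 1)
    return [patch[i * step] for i in range(size)]
```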


Program Variables

CONTROL                LOGICAL  If CONTROL is true, the test pixel file is
                       opened and CONTROL is set to false.

DAT(NL,ND,NS)          INTEGER ARRAY  At J, contains the list of pixels
                       (brightness values) that constitute the current
                       patch.

FCHECK                 SYSTEM INTRINSIC  Checks for I/O errors.

FILEINX                INTEGER  Part of test pixel file name.

FILENO                 INTEGER  Test pixel file system file number.

FILESIZE               INTEGER*4  Maximum number of records in test pixel
                       file.

FNAME                  CHARACTER  Test pixel file name.

FOPEN                  SYSTEM INTRINSIC  Opens a file.

FWRITE                 SYSTEM INTRINSIC  Writes a record.

I,IS,K,KN              INTEGER  DO loop index.

IA                     INTEGER  File options bit mask.

IB                     INTEGER  File access bit mask.

IERR                   INTEGER  I/O error code.

IMSG                   INTEGER ARRAY  Holds message to be printed.

IND                    INTEGER ARRAY  IDIMS error indicators.

ITSP                   INTEGER  Temporary test pixel storage.

J                      INTEGER  The pointer into KNT and DAT for the current
                       patch to be processed.

KNT                    INTEGER ARRAY  At J, the population of the current
                       patch.

LA                     LOGICAL  File options bit mask.

LAB                    INTEGER ARRAY  Label vector.

LB                     LOGICAL  File access bit mask.

LTSP                   LOGICAL  Test pixel temporary storage.

MSG                    CHARACTER  Message to be printed.

N                      INTEGER  Count of test sets written to disk.

NA                     INTEGER  Count of occupied labels.

ND                     INTEGER  Number of dimensions.

ND5                    INTEGER  Number of dimensions times five.

NL                     INTEGER  Number of lines.

NS                     INTEGER  Number of samples in strip.

PABORT                 SYSTEM SUBROUTINE

PIXELND                INTEGER  Counter for pixels written to a record
                       (1 to 5*ND).

PRINTP                 SYSTEM SUBROUTINE

TSP                    LOGICAL ARRAY  Test pixel record.

UICB                   INTEGER ARRAY  User information control block.

Z                      INTEGER  Absolute zero, see "Tricking Fortran".


t*l« HIVLCTT-MCtM* Uioat.tt *} fOttHN/IO* 


T«c. ocr II* i««i> «iit -<•* 


•f»as 


•CaaVML StCHCHTaDHOCtMtC 

•o*aa 

Q»rrr«9(‘ 


CUatOUriNr CLO<tC(a.4.r(P.L*a.KNT.»*T.NL.III.II».Nft.N.riLINO. 

•ocaa 

••rrt*«« 


1 ulca. m»>coNraoL.N»s I 

•••as 

•»aa9»9c 

c 

THIS SUiaturiNt is • POKIION op tHt •ROCS* IMRS PSMCTIS* 

•••as 


c 

tr CNootct rcsT pixcli mod • fMrcH rHur is seinc closeo^ 

•••as 

99r$lt99 

e 

•NS •liars i»ff«8PRi«rc csoNriRi. 

•••as 

99T$t999 

c 


•••as 

99799999 


LS6tc«t. irsp.rspcHSS).Ls.i.SieoNrROL 

•••as 

99T99999 


inrtcfR*4 PUfStac 

•••as 

99T99999 


INTC6ER*a Z.ITSP.O«r(HL.Ne.M).RMT(l I.LRSCI >. IHSCC 3S >. PI Lt INK 

•••as 

99799999 


tRUtVNLENCC (LTSP.ITIPI 

•••as 

99797999 


IHTPCftca PIXfLNS. IN. IS.OICRt 1>.IN»( II.PIIRNS 

•••as 

99799999 


SYSTIN INTRiltIC rOPIN.FCNfCR.PNRITt 

•••as 

99799999 


CHNRNCTtRcPI NIC 

»^«as 

99799999 


eN«PNCTER*9 PNMt 

•••as 

99791999 


CtUIVNlCNCC <L«. l»).iL9. !■>.< INSC.NSCI 

••cas 

99799999 


IP < .NOr CONTROO 60 fS S 

•••!• 

99791999 


IN ■ • 

•••la 

99799999 


riLfiizc ■ csssi 

•••s« 

99795999 


IS • 4 

•o«tl 

99799999 


CONTROL ■ PNISC. 

•••«• 

99797999 

c 

oriH INC •lit PILP 

•••«• 

99799999 


PNRNE • *f»TPKL .* 

•••S( 

99799999 


OS S PILCINX ■ I.S9 

•»••) 

99999999 


PNINKinai • STRlPILillHX) 

••!•• 

99991999 


PILfNO • 0 

•«i«a 

99991999 


PUfNO > POfCNCPNRRt.LN.lS.NSS. .. .a. .puasiai. t«> 

•cia« 

9999S999 


IP < cc It. 1. a 

•«iai 

99994999 

a 

CONTINUC 

••lai 

99995999 


CRIL FCHtCKFILSNO. flRR) 

• •ISO 

•«•••••• 


IP CUM tS.I«O.OR ItXR as lot) CO TO s 

• •!«• 



NIC > ' OPIN aXROR ON TCIT PIXIL OUR PILE* 

•cirs 

04a»«0«0 


Nice ia:S J ■ STRC ICRRI 

••ait 

99999999 


CRLL PRINTPCNICB.INO. 1. INS6. 4 I. •.•.•. •,•.•.•.•> 

»«a«a 

99970999 


CALL fXSORrcOICS .4S . e > 

••asa 

90911999 

• 

CONMNUe 

•«ast 

90977999 

I 

CONTIRue 

•«ass 

99915999 


XX • 99714) 

•«ast 

99974999 


irCKH Lr.SI 60 TO !•• 

•oa«a 

99975999 


II • CRN-ll/4 

• <3I« 

99979999 


PIXCLXQ > • 

•«ato 

•««l 


00 t« 1 • 1 .KN.lt 

9<in 

«4«l*«4« 


00 14 K « 1 .HO 

•eica 

•••I909* 


PIXCLNQ ■ PIXtLHO * 1 

•cats 

coaa»»4* 


ITIR » ONTC 7.7,4) 

• CSIS 

ccsaiccc 


le rtfCRlxCLHOI ■ LTSP 

•ciaa 

ec«aac4c 

c 

•Hit: >:«r pixels to oiik rut 

•ciaa 

•a ••!••« 


CALL FHRI racPtLCNO. TIP. NFS . CONTROL > 

• CISC 

cotaccoo 


IP C CC > 4 .S.4 

•csia 

04«as««« 


4 CALL FCHCCXCPILENO. ICAR ' 

•oiar 

cotatooo 


NIC ■ * ERROR OUilNC UCITC OF TEST PIXEL FUE* 

•csa« 

99977999 


RICCI'll ■ IfRCICRR) 

• C4II 

c««ai««4 


CALL PR 1 NTPC 01 Cl. INS. 1. 1 HI6 . 43 . • . 4 . 0 . « , R . •,•.•> 

•«4«; 

•oiate** 


CALL PaSORTCUICS.AS.A > 

•test 

••••••to 

s 

CONTINUE 

• •4S! 

(■OSIIC04 

c 


• C4S! 


c 

iitrrtnfir r.tijnr of test pixel croufi wRitrEN to ritx 


rctr rtufL ctouM wRitfCM re ritx 
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»«ir 

CLMIC 





»«4!1 • 

•1I44« 

H • 

A ♦ 1 




«f49« • 

414444 

149 rNT<4» • 4 




»t49r « 

419444 

L«1(J> ■ 1 




•«4ia 4 

414444 

C 4IC4IIICIIT COVHT 

OP 

icrUrMco 

LAICLI 

•C441 4 

4I1444 

NA > 

AA-I 




•44Ct 4 

414444 

4CTURA 




4441* 4 

419444 

CR4 





tVMOI. 

n«4 








T*fl 

<T4UCTU«f 

ADDREII 


CLCtCC 



4UA40VTIMC 



•or 


iHTiet* 

A4AAT 


4-XI4 . 

1 

MIIIMX 


iNTceci 

tIAALI 

VAC 

4*«1l 


rtLtttzc 


INrC6lt»4 

1IA4LC 

VAC 

4*«ai 


rof«» 


INTCCC4 

rUACriON 



1 


INTtCtI 

AlAfLC 

VAC 

4*XP 


!• 


INTCetR 

tlAALt 

VAC 

4*UI 


insc 


INTtCCA 

AARAV 


4*X9 . 

1 

i« 


lATKtR 

ItArLI 

VAC 

4*xta 


4 


INTtCCI 

4IAfL< 

VAC 

4-xaa . 

1 

RN 


lATCGIt 

llArLf 

VAC 

4*xa4 


lA 


LOGICAL 

tlAALC 

VAC 

4«XI4 


It 


L44ICAL 

IIAPLC 

VAC 

4*XI t 


M6 



4IAPLC 

VAC 

4*X4 . 

I 

N» 


IATCCC4 

tIAPLC 

VAR 

4>xia . 

1 

ND9 


INTCCIR 

IIAPLC 

VAR 

4*X4 . 

1 

Ml 


INTCCCC 

IIAPLC 

VAC 

4-X14 . 

1 

tlKIlMO 


lATtCCt 

IIAPLC 

VAR 

4*XI1 


Tir 


LOCtCAL 

AACAT 


i-xat 

I 

I 


lAUCCI 

IIAPLC 

VAR 

4-xai . 

I 


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AAAC 

TYPE 

GTCUCTURC 

AG4RCII 

COMTCOL 

LOCItAL 

tlAPLC VAC 

C-l9 

4 1 

PCACtR 


SUIPevI INC 



riLCAO 

lATCGCC 

GIA*LC VAC 

4-XI4 

,1 

PHAAC 

CAACACTCR 

IIAPLC VAC 

4*xai 

.1 

P4CIU 


fUCROUri AC 



lA 

lATCCCR 

IIAPLC VAC 

4YXI4 


ICCR 

lATCGCR 

IIAPLC VAC 

4*XI4 


IA4 

lATCCEC 

ACCAT 

V-X4 

.1 

ITtP 

lATCGCR 

IIAPLC VAR 

U*XIP 


R 

lATCCCC 

tIAPLC VAR 

4*XI9 


CAT 

lATCGCC 

ARRAY 

0-XIP 

. 1 

LAI 

lATCGlC 

ARRAY 

4-xac 

, 1 

LTIP 

LOGICAL 

IIRPIC VAt 

4«X1P 


A 

lA.rCGCR 

IIAPLC VAfc 

l-XIl 

,1 

A» 

lATCGCR 

IIAPLC VAC 

4-Xll 

.1 

AL 

lATCGCR 

IIAPLC VAC 

a-xii 

. 1 

PAIOCT 


GUCCOUTIAC 



PCIAIP 


IHCC4UTI AC 



Utra 

lATCGCC 

ARRAY 

4-XP 

. 1 


Meci«n UNIT ctttcc commli* 
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Parent: NUMCLU 


COLAPS 


COLAPS(MAX,MEAN,SUM,ND)


In this subroutine, each vector in MEAN whose corresponding entry in
SUM is zero is eliminated. The calling program NUMCLU uses SUM to mark
vectors in MEAN which are no longer in force. Classification is more
efficient when needless branches on SUM(.) = 0 are avoided.

Method: The method is as simple-minded and as inefficient as a bubble
sort. Any time SUM(.) = 0 is encountered, move the entire array down
one slot. (But) it is self-documenting.
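
The compaction can be sketched in Python as follows. This is an illustration of the idea rather than a transcription of the FORTRAN: the name colaps and the use of Python lists are assumptions, and the sketch moves the next live vector into each dead slot instead of shifting the whole array.

```python
def colaps(mean, total):
    """Compact mean/total in place so that entries with total == 0 come last."""
    n = len(total)
    for i in range(n - 1):
        if total[i] != 0:
            continue                      # slot i is live: nothing to do
        for j in range(i + 1, n):
            if total[j] != 0:             # next live vector: move it down
                total[i], total[j] = total[j], 0
                mean[i] = list(mean[j])
                break
        else:
            return                        # no live vectors remain above i
```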


Program Variables

I,J,K,MM,IP1           INTEGER  DO loop parameters.

MAX                    INTEGER  Number of vectors in MEAN.

MEAN(ND,MAX)           INTEGER ARRAY  The mean vectors, to be collapsed.

ND                     INTEGER  Dimensionality.

SUM(MAX)               INTEGER ARRAY  Pointer array; vectors in MEAN with
                       SUM(.) = 0 should be eliminated.


HEWLETT-PACKARD 32102B.01.03 FORTRAN/3000      TUE, OCT 13, 1981, 9:41 AM

$CONTROL SEGMENT=AMOEBASEG
      SUBROUTINE COLAPS(MAX,MEAN,SUM,ND)
C
C
C
C     SUBROUTINE COLAPS GOES THRU THE CLUSTERS IN MEAN
C     AND DELETES ANY CLUSTER WITH SUM(K) = 0, COMPRESSING
C     THE ARRAY MEAN.  THIS ALLOWS SLIGHTLY MORE EFFICIENT
C     SEARCH WHEN TRYING TO CLASSIFY A POINT.
      INTEGER*2 MEAN(ND,MAX),SUM(MAX)
      MM = MAX-1
      DO 1 I = 1,MM
      IF (SUM(I).NE.0) GO TO 1
      IP1 = I+1
      IF (IP1.EQ.MAX) RETURN
      DO 3 J = IP1,MAX
      IF (SUM(J).EQ.0) GO TO 3
      SUM(I) = SUM(J)
      SUM(J) = 0
      DO 2 K = 1,ND
    2 MEAN(K,I) = MEAN(K,J)
      GO TO 1
    3 CONTINUE
      RETURN
    1 CONTINUE
      RETURN
      END


SYMBOL MAP

NAME        TYPE         STRUCTURE

COLAPS                   SUBROUTINE
I           INTEGER      SIMPLE VAR
IP1         INTEGER      SIMPLE VAR
J           INTEGER      SIMPLE VAR
K           INTEGER      SIMPLE VAR
MAX         INTEGER      SIMPLE VAR
MEAN        INTEGER      ARRAY
MM          INTEGER      SIMPLE VAR
ND          INTEGER      SIMPLE VAR
SUM         INTEGER      ARRAY

PROGRAM UNIT COLAPS COMPILED
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Parent : START CONNCT 

CONNCT(FINISHED,Z,N,A1,A2,BUF,IN,LAB) 

This subroutine grows components from intervals. 

Method: Label line A1 contains either Z or patch labels; Z marks boundary.
Line A2 contains Z (boundary) or interval marks. A2 is scanned; when an
interval is found, A1 is examined looking for a patch label. If a
patch label is found, it is saved in BUF.

Then A2 is scanned again. An interval mark is replaced by the
corresponding label in BUF. If none is found, the label counter LAB is
incremented and LAB is stored as the new label. Labels begin at -32767
and are allowed to grow to as much as 32767. On reaching 32767, flag
FINISHED is set to .TRUE. so the calling program will know the supply
of patches has exhausted the labels.

"U" shaped components will not be found by this method. Rather,
they will be pieced together as two different fields of labels. We do
back up one line in loop 50, which sometimes removes single element
patches.
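
The two scans of A2 and the back-up step can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the function name connct mirrors the routine, but the list-based interface, the returned tuple, and the constant MAX_LABEL are hypothetical; interval marks are taken to run from 1 to the number of intervals.

```python
Z = -32768          # boundary marker
MAX_LABEL = 32767   # reaching this value means the labels are used up


def connct(a1, a2, n_intervals, lab):
    """Turn interval marks in a2 (1..n_intervals, or Z) into patch labels.

    Returns new values (a1, a2, lab, finished).
    """
    buf = [Z] * (n_intervals + 1)          # slot 0 unused: marks start at 1
    for above, mark in zip(a1, a2):
        if mark != Z and above != Z:
            buf[mark] = above              # interval touches an old patch
    for k in range(1, n_intervals + 1):
        if buf[k] == Z:                    # untouched interval: new patch
            lab += 1
            if lab == MAX_LABEL:
                return a1, a2, lab, True   # supply of labels exhausted
            buf[k] = lab
    a2 = [Z if mark == Z else buf[mark] for mark in a2]
    # Back up one line: copy the new label over any old label it touches.
    a1 = [new if (new != Z and old != Z) else old for old, new in zip(a1, a2)]
    return a1, a2, lab, False
```

In the test case where one interval sits under two different old labels, the last label seen wins and the back-up step overwrites the other one on the elder line.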


Program Variables

A1(N),A2(N)            INTEGER ARRAY  A1 is the elder line with patches
                       already marked. A2 is the new line of intervals to
                       be turned into patch labels. (The first patches
                       are created in the calling program.)

BUF(IN)                INTEGER ARRAY  Used to stash labels to be transferred
                       to intervals.

FINISHED               LOGICAL  When we run out of labels, FINISHED is set
                       to .TRUE. The calling program uses this flag to
                       terminate processing (gathering patches).

I,K                    INTEGER  DO loop index.
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CONNCT-2 


IN                     INTEGER  Number of intervals in A2.

IS,IT                  INTEGER  Temporary labels.

LAB                    INTEGER  Current label pointer.

N                      INTEGER  Number of elements in a line.

Z                      INTEGER  Boundary marker, -32768.
HEWLETT-PACKARD 32102B.01.03 FORTRAN/3000      TUE, OCT 13, 1981

$CONTROL SEGMENT=AMOEBASEG
      SUBROUTINE CONNCT(FINISHED,Z,N,A1,A2,BUF,IN,LAB)
      LOGICAL FINISHED
      INTEGER*2 Z,A1(N),A2(N),BUF(IN)
      DO 10 K = 1,IN
   10 BUF(K) = Z
      DO 20 I = 1,N
      IT = A2(I)
      IF (IT.EQ.Z) GO TO 20
      IS = A1(I)
      IF (IS.EQ.Z) GO TO 20
      BUF(IT) = IS
   20 CONTINUE
      DO 30 K = 1,IN
      IF (BUF(K).NE.Z) GO TO 30
      LAB = LAB+1
      IF (LAB.EQ.32767) GO TO 60
      BUF(K) = LAB
   30 CONTINUE
C
C     NOW TRANSFER ACTUAL LABELS
      DO 40 I = 1,N
      IT = A2(I)
      IF (IT.NE.Z) A2(I) = BUF(IT)
   40 CONTINUE
C
C     BACK UP AND CLEAR OF SINGLETONS (MAYBE)
      DO 50 I = 1,N
      IT = A2(I)
      IF (IT.EQ.Z) GO TO 50
      IF (A1(I).NE.Z) A1(I) = IT
   50 CONTINUE
      RETURN
   60 FINISHED = .TRUE.
      RETURN
      END


SYMBOL MAP

NAME        TYPE         STRUCTURE

A1          INTEGER      ARRAY
A2          INTEGER      ARRAY
BUF         INTEGER      ARRAY
CONNCT                   SUBROUTINE
FINISHED    LOGICAL      SIMPLE VAR
I           INTEGER      SIMPLE VAR
IN          INTEGER      SIMPLE VAR
IS          INTEGER      SIMPLE VAR
IT          INTEGER      SIMPLE VAR
K           INTEGER      SIMPLE VAR
LAB         INTEGER      SIMPLE VAR
N           INTEGER      SIMPLE VAR
Z           INTEGER      SIMPLE VAR
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Parent : NUMCLU DIAMTR 

DIAMTR(MEAN,ND,ITL,IDIAM)

This routine determines the square of the diameter of the vectors in
array MEAN.

Method: Except for the bias of -32768, the method is self-documenting.
The biased squared distance from MEAN(.,I) to MEAN(.,J), with
1 .LE. I .LT. J .LE. ITL, is computed, and the maximum such value is
returned as IDIAM.
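
The computation amounts to a maximum over all pairs, as in the Python sketch below. The name diameter_squared is hypothetical, and treating the bias as a simple offset added to the largest squared distance is an assumption made for illustration.

```python
from itertools import combinations

BIAS = -32768  # offset that keeps results inside a signed 16-bit range


def diameter_squared(vectors, bias=BIAS):
    """Largest squared Euclidean distance between any two vectors, plus bias."""
    best = 0
    for u, v in combinations(vectors, 2):   # every pair I < J exactly once
        d = sum((a - b) ** 2 for a, b in zip(u, v))
        best = max(best, d)
    return best + bias
```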


Program Variables

I,J,K,IP,ITLM          INTEGER  DO loop parameters.

IDIAM                  INTEGER  Square of diameter of vectors in MEAN.

ITL                    INTEGER  The total number of vectors in MEAN.

MEAN(ND,ITL)           INTEGER ARRAY  The array whose diameter is to be
                       determined.


HEWLETT-PACKARD 32102B.01.03 FORTRAN/3000      TUE, OCT 13, 1981, 9:41 AM

$CONTROL SEGMENT=AMOEBASEG
      SUBROUTINE DIAMTR(MEAN,ND,ITL,IDIAM)
C
C     PARENT PROGRAM: NUMCLU
C
C     SUBROUTINE DIAMTR FINDS THE SQUARE OF THE DIAMETER
C     OF THE SET OF CLUSTER CENTERS IN MEAN
      INTEGER*2 MEAN(ND,ITL)
      ITLM = ITL-1
      IDIAM = -32768
      DO 1 I = 1,ITLM
      IP = I+1
      DO 1 J = IP,ITL
C
C     FIND THE DISTANCE **2 BETWEEN MEAN(.,I) AND MEAN(.,J)
      IS = -32767
      DO 2 K = 1,ND
      IF (IS .GE. 16384) GO TO 1
      IT = MEAN(K,I)-MEAN(K,J)
    2 IS = IS+IT*IT
      IF (IS.LE.IDIAM) GO TO 1
      IDIAM = IS
    1 CONTINUE
      RETURN
      END


SYMBOL MAP

NAME        TYPE         STRUCTURE

DIAMTR                   SUBROUTINE
I           INTEGER      SIMPLE VAR
IDIAM       INTEGER      SIMPLE VAR
IP          INTEGER      SIMPLE VAR
IS          INTEGER      SIMPLE VAR
IT          INTEGER      SIMPLE VAR
ITL         INTEGER      SIMPLE VAR
ITLM        INTEGER      SIMPLE VAR
J           INTEGER      SIMPLE VAR
K           INTEGER      SIMPLE VAR
MEAN        INTEGER      ARRAY
ND          INTEGER      SIMPLE VAR

PROGRAM UNIT DIAMTR COMPILED
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Parents: START, MARKUPDN, MARKLR 


FILLLR 


FILLLR(Z,LAB,N) 

This subroutine works along a line of labels and replaces a label whose
two neighbors are marked boundary with the boundary label. It, so to
speak, fills in left-right cracks in the boundary map.

Method: Only one tricky point is involved. LL, LR, and LM are used
to save labels down a line. This minimizes indexing while preventing
propagation of boundary down lines. As usual, Z = -32768 is used to
mark boundary.
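
A left-right crack fill can be sketched in Python as below. The name fill_lr is hypothetical, the end elements are assumed to be left unchanged, and neighbors are read from the original line so that a newly set boundary cannot influence later positions.

```python
Z = -32768  # boundary marker


def fill_lr(lab):
    """Mark as boundary every interior label whose two neighbors are boundary."""
    out = list(lab)
    for i in range(1, len(lab) - 1):
        ll, lr = lab[i - 1], lab[i + 1]   # read neighbors from the input only
        if ll == Z and lr == Z:
            out[i] = Z
    return out
```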


Program Variables

                       INTEGER  DO loop index.

                       INTEGER  Pointer to current position.

LAB(N)                 INTEGER ARRAY  The labels being processed.

LL,LM,LR               INTEGER  Labels down the line.

N                      INTEGER  Number of labels.

Z                      INTEGER  Boundary: -32768.
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STRUCTURE 

ADDRESS 
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SIARLC VAR 
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lATCCCR 
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Parent : CLASSIFY FIXUP 

FIXUP(LABELS,NC,I1,I2,I3,PIXELS,REJECT,CLUSTERS,ND,MAXCLUS)

This subroutine follows MARKUP and attempts reclassification of a pixel
classified unlike each of its four neighbors. The attempt fails when
each neighbor is too far from the pixel to stand that reclassification.

Method: As in MARKUP, a circular buffer LABELS is maintained. I1 points
to the eldest, I2 to the current (which may have been modified by the
addition of 101), and I3 to the newest (unchecked) line of labels.

We regard the classifications in line I1 as final, those in I3 as
tentative. Those in I2 marked over 100 are fixed by subtraction of the
101 added by MARKUP. Of the rest, note they are classed unlike any
neighbor. Collect, from the four neighbors, the classes of each
neighbor (a) in line I1, (b) in line I2 and marked, or (c) in line I3 and
like at least one neighbor in that line. This gives up to four classes.
Reclassify the pixel in the nearest unrejected class of those up to four
classes. If no unrejected class exists, leave the classification alone.
All distances are biased by -32768.
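
The final selection step, picking the nearest unrejected class among the collected neighbor classes, can be sketched in Python as below. The name reclassify, the dictionary-based interface, and the comparison of unbiased squared distance against the rejection threshold are assumptions made for illustration.

```python
def reclassify(pixel, candidates, clusters, reject, current):
    """Return the nearest unrejected candidate class, or `current` if none."""
    best, best_d = current, None
    for c in candidates:                     # up to four neighbor classes
        d = sum((p - q) ** 2 for p, q in zip(pixel, clusters[c]))
        if d > reject[c]:
            continue                         # too far: class c is rejected
        if best_d is None or d < best_d:
            best, best_d = c, d
    return best
```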


Program Variables

CLUSTERS(ND,MAXCLUS)   INTEGER ARRAY  The cluster centers.

FOUR(4)                INTEGER ARRAY  Used to store the (up to) four
                       classifications of OK neighbors.

I                      INTEGER  DO loop index.

I1                     INTEGER  Pointer to eldest label line.

I2                     INTEGER  Pointer to current label line.

I3                     INTEGER  Pointer to newest label line.

IM                     INTEGER  I-1; pointer to current slot.

IS                     INTEGER  Sum accumulator for distance.
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FIXUP-2 


ITT                    INTEGER  Scratch variable.

J                      INTEGER  Index into FOUR: DO loop index.

J4                     INTEGER  Value of FOUR(J) in loop.

K                      INTEGER  DO loop index.

L                      INTEGER  DO loop index.

LL                     INTEGER  Label on left.

LM                     INTEGER  Label in middle.

LR                     INTEGER  Label on right.

MAXCLUS                INTEGER  Dimension for CLUSTERS.

NC                     INTEGER  Number of samples per line.

ND                     INTEGER  Dimensionality.

NDIST                  INTEGER  Distance from nearest neighbor classifier,
                       pixel to nearest of up to four.

NRST                   INTEGER  Index of nearest, or zero if all are
                       too far away.

PIXELS(NC,ND)          INTEGER ARRAY  One line of data along line I2.

REJECT(MAXCLUS)        INTEGER ARRAY  Rejection thresholds.
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Parent: THINTSTM 6ETN25 

GETN25(FRSTFLG,TTP,N25,ND5,FILENO,UICB,IND)

This program fetches N25 test sets from a disk file written by the
subroutine CLOSEC.


Method: On the first call to the subroutine the disk file is rewound.
On each call, N25 test sets are read from the file. A test set is
five test pixels. The file is job-temporary.


Program Variables

FCHECK                 SYSTEM INTRINSIC  Error checking.

FCONTROL               SYSTEM INTRINSIC  Used to rewind file.

FILENO                 INTEGER  System file number.

FREAD                  SYSTEM INTRINSIC  File read.

FRSTFLG                LOGICAL  Marks first call.

I                      INTEGER  DO loop index.

ICALL                  INTEGER  Number of words returned by read.

IERR                   INTEGER  Stuffed with error code.

IMSG                   INTEGER ARRAY  Message to be printed.

IND                    INTEGER ARRAY  Error code location.

MSG                    CHARACTER  Message to be printed.

N25                    INTEGER  Number of test sets requested.

ND5                    INTEGER  5 times number of dimensions.

PABORT                 SYSTEM SUBROUTINE

PRINTP                 SYSTEM SUBROUTINE  Prints a message.

TTP                    LOGICAL ARRAY  Stuffed with test sets.

UICB                   INTEGER ARRAY  User information control block.

ZERO                   LOGICAL  Dummy argument for FCONTROL.
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I 

INTECCR 

SIRRLE VAR 

0«2S 

ICALL 

INTEGER 

SINPLE VAR 

0*17 

lERF 

INTEGER 

SlnRLE VAR 

04 It 

INSC 

INTEGER 

ARRAY 

0*13 

INP 

INTEGER 

ARRAY 

P-X4 ,I 

NSC 

CHARACTER 

SIHPLE VAR 

0*14 

M2S 

INTEGER 

SINFLE VAR 

0-110 ,I 

NDS 

integer 

SINPLE VAR 

0-17 .i 

FASORI 


SUBROUTINE 


'RINTR 


SUBROUTINE 


TTF 

LOGICAL 

ARRAY 

P-lll ,1 

UlCB 

INTEGER 

ARRAY 

0-19 .1 

lelio 

LOGICAL 

STNELE VAR 

OHIO 
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PARENT: MAPP 


IIIFN 


FUNCTION IIIFN(I) 

Maps -32768 to 32767 into 1 to 59 so that: -32768, 0, and 99 all map
into 1; other integers I map to numbers between 2 and 59. This function
is used by MAPP to index into SYMBOL. Special numbers -32768, 0, and
99 are all mapped to print as blanks. Others print as symbols so that
when I and J are close, the symbols are different.

Method: Self-documenting.
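
A mapping with the stated properties can be sketched in Python as below. The modulus 60 and the offset 30 are assumptions chosen so that ordinary values land in 2 through 59 and nearby values get different indices; the name iiifn simply mirrors the function.

```python
def iiifn(i):
    """Map a 16-bit integer to a symbol index in 1..59."""
    if i in (0, 99, -32768):
        return 1            # special values share the blank symbol
    i = abs(i) % 60
    if i <= 1:
        i += 30             # keep ordinary values out of slots 0 and 1
    return i
```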


Program Variables 

I INTEGER Function argument 


      FUNCTION IIIFN(I) 
      IF (I.GT.0) GO TO 3 
      IF (I.EQ.0 .OR. I.EQ.-32768) GO TO 4 
      I = -I 
      GO TO 5 
    3 IF (I.EQ.99) GO TO 4 
    5 I = MOD(I,58) 
      IF (I.LE.1) I = I+58 
      IIIFN = I 
      RETURN 
    4 IIIFN = 1 
      RETURN 
      END 

SYMBOL MAP 

NAME       TYPE       STRUCTURE 
I          INTEGER    SIMPLE VAR 
IIIFN      INTEGER    FUNCTION 

PROGRAM UNIT IIIFN COMPILED 



Parent : MAIN MAPP 

Calls : IIIFN 

MAPP(N1,N2,N3,UICB,IND,IMAGE,NR,NC,SYMBOL) 

Produces a quick look at a segment of data, a labels map, or a cluster 
map. Output is directed to the default device. This program is intended 
for debugging. 

Method : The subroutine prints N2 lines in band one of an image, starting 
at line N1, sample N3. Because of the limitations of PRINTP, only 64 
samples can be printed. Line numbers are printed, but not column numbers. 

Program Variables 

CHKIO          SYSTEM SUBROUTINE 
IC             INTEGER             DO loop index. 
IIIFN(IV)      INTEGER FUNCTION 
IMAGE          INTEGER             Image number. 
IND(1)         INTEGER ARRAY       Error information. 
IPX(36)        INTEGER ARRAY       Dummy array to compensate for the 
                                   inadequacies of PRINTP. Equivalenced 
                                   to PICTURE. 
IP             INTEGER             DO loop index. 
N1             INTEGER             Starting line number. 
N1PN2          INTEGER             Last line number to print. 
N2             INTEGER             Number of lines. 
N3             INTEGER             Starting sample number. 
NC             INTEGER             Number of samples in image. 
NCL            INTEGER             Last sample to be printed. 
NCP            INTEGER             Actual number printed. 
NCP8           INTEGER             NCP+8, the number sent to PRINTP. 
NR             INTEGER             Number of lines in image. 
PICTURE        CHARACTER*72        Area for core-to-core write under 
                                   FORMAT control. 
PRINTP         SYSTEM SUBROUTINE 
READP          SYSTEM SUBROUTINE 
SCAN(64)       INTEGER ARRAY       Array to read from image. 
SYMBOL(59)     CHARACTER*1 ARRAY   The symbols printed; created by 
                                   SETSYM. 
UICB           INTEGER ARRAY 
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Parent: START MARKLR 

Calls : FILLLR 

MARKLR(Z,DATA,INTTHR,NC,ND,LABELS,MASK) 


This subroutine marks boundary points left-right. 

Method : A line of data is scanned. Points in the mask are marked 
boundary (under control of MASK). Points which, in any band, are closer 
to their left neighbor than INTTHR(.) are marked boundary. Finally, subroutine 
FILLLR is used to fill in a single point gap along the line. 


Program Variables 

DATA(NC,ND)    INTEGER ARRAY   One line of data. 
FILLLR         SUBROUTINE      Fills in gaps along a line. 
I,K            INTEGER         DO loop index. 
IM             INTEGER         I-1 
INTTHR(ND)     INTEGER ARRAY   The thresholds for boundary finding. 
LABELS(NC)     INTEGER ARRAY   A line of labels. 
MASK           LOGICAL         The mask flag. When .TRUE., zero in band 
                               1 marks a point off the image, i.e., masked. 
NC             INTEGER         Number of samples per line. 
ND             INTEGER         Dimensionality. 
Z              INTEGER         -32768, passed as a parameter, and used to 
                               mark boundary points. 
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Parent : CLASSIFY MARKUP 

MARKUP(LABELS,NC,I1,I2,I3) 

This subroutine "marks" pixels classified like at least one neighbor. 
This follows a nearest neighbor classification by PERPIXEL, and is followed 
by a possible reclassification of errant pixels by FIXUP. 

Method : The pixel labels reside in a circular buffer LABELS of NC labels 
per line. I1 points to the eldest label, I2 to the current label, and I3 
to the newest. The eldest label may have been spatially modified, but 
labels along scan line I2 are not propagated down that scan line. The 
number 101 is added to any label classified like at least one of the four 
neighbors along scan line I2. 
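
The marking rule can be illustrated with a short Python sketch. It is written under stated assumptions rather than taken from the original code: the function names are hypothetical, raw labels are assumed to be below 101 (so an earlier mark on the eldest line can be stripped before comparison), and the three lines are passed separately instead of through the circular buffer.

```python
MARK = 101  # value added to a label that agrees with a neighbor

def base(label):
    """Strip a previously added mark (assumes raw labels are below 101)."""
    return label - MARK if label > 100 else label

def markup(eldest, current, newest):
    """Return the current line with 101 added wherever a label matches
    its left, right, upper, or lower neighbor."""
    raw = list(current)           # compare against unmodified labels
    marked = []
    for c, lab in enumerate(raw):
        neighbors = [base(eldest[c]), newest[c]]
        if c > 0:
            neighbors.append(raw[c - 1])
        if c < len(raw) - 1:
            neighbors.append(raw[c + 1])
        marked.append(lab + MARK if lab in neighbors else lab)
    return marked
```

Comparing against the saved raw labels is what keeps a mark from propagating along the current scan line.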

Program Variables 

I              INTEGER         DO loop index. 
I1             INTEGER         Eldest line pointer. 
I2             INTEGER         Current line pointer. 
I3             INTEGER         Newest line pointer. 
IM             INTEGER         I-1 
LABELS(NC,3)   INTEGER ARRAY   The labels; LABELS(.,I2) is modified 
                               by adding 101 when at least one neighbor 
                               has the same label. 
LL             INTEGER         Label on left. 
LM             INTEGER         Label in middle. 
RL             INTEGER         Label on right. 
NC             INTEGER         Number of samples per line. 
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Parent: START MARKUPDN 

Calls : FILLLR 

MARKUPDN(Z,D1,D2,L1,L2,L3,NC,ND,INTTHR,MASK) 

This subroutine marks boundary points up-down. 

Method : If, in any band, data in D1 is closer to or equal to data in D2 
than the threshold INTTHR, then labels L1 and L2 at that sample are 
marked by being set equal to Z. The reason for the "less than or equal to" 
rather than "less than" as appears in MARKLR is that the data is generally 
less variable down scan lines than across scan lines. Points off the image 
are also marked as boundary provided MASK is set. At the conclusion of this 
scan through the data, FILLLR is called on center line L2. 


Program Variables 

D1(NC,ND)      INTEGER ARRAY   Two adjacent lines of data; D1 is the 
D2(NC,ND)                      eldest, D2 newer. 
FILLLR         SUBROUTINE      Fills in gaps along a line. 
I,K            INTEGER         DO loop index. 
INTTHR(ND)     INTEGER ARRAY   Vector thresholds for deciding boundary. 
L1(NC)         INTEGER ARRAY   Three scan lines of labels. L1 is 
L2(NC)                         oldest, L3 newest. 
L3(NC) 
MASK           LOGICAL         Flag used to decide whether a value of 0 in 
                               D1(.,1) or D2(.,1) marks the mask. Mask 
                               points are marked boundary. 
NC             INTEGER         Number of samples per line. 
ND             INTEGER         Dimensionality. 
Z              INTEGER         -32768, used to mark boundary. 
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Parent : MAIN MOREQUES 

Calls : REJECTH 

MOREQUES(MEANS,TESTS,MAXCLUS,NFCLUS,ND,NTS,REJECT) 

The purpose of MOREQUES is to detect when a necessary cluster has been lost 
and add it in. The test is based on the pure rejection thresholds applied to 
classification of the center (i.e., third of 5) test pixel in each test 
set. 


Method: Initially, REJECTH is called to determine the pure rejection 
thresholds. Then a loop on the centers of the test sets is entered. Each time no 
cluster center is closer than its rejection threshold to the test pixel, 
that test pixel is added as a new cluster center, the reject thresholds are 
recalculated, and the next test set is examined. On completion of all 
examinations, the reject thresholds are multiplied by 2 (effectively multiplying 
the Euclidean distance test by the square root of 2, preparing for 
misregistration mixtures). 
Since there is a bias of -32768, the actual calculation goes as follows: 

If r is the unbiased rejection threshold and R the biased, then 
r = R + 32768, so the new unbiased threshold 2r has the biased threshold 

R' = 2r - 32768 = 2R + 32768. (If R' is greater than 16000, use R' = 16000.) 
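
The biased doubling can be checked with a small Python sketch. The function and constant names are hypothetical; only the bias of -32768 and the cap of 16000 come from the text.

```python
BIAS = 32768   # thresholds are stored as (true value - BIAS)
CAP = 16000    # upper limit applied to the doubled biased threshold

def double_biased_threshold(r_biased: int) -> int:
    """Double the unbiased threshold and return it in biased form."""
    r_unbiased = r_biased + BIAS
    doubled = 2 * r_unbiased - BIAS      # algebraically 2*r_biased + BIAS
    return min(doubled, CAP)
```

For example, a biased threshold of -30000 (unbiased 2768) becomes -27232 (unbiased 5536), while any result above 16000 is clamped.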


Program Variables 

AO                   INTEGER*4       Long integer used to perform long 
                                     calculations. 
I,J,K                INTEGER         DO loop parameters. 
IS,IT                INTEGER         Used in accumulating distance. 
MAXCLUS              INTEGER         Maximum number of clusters. 
MEANS(ND,MAXCLUS)    INTEGER ARRAY   The cluster centers. 
ND                   INTEGER         Dimensionality. 
NFCLUS               INTEGER         Current number of clusters. 
NTS                  INTEGER         Number of test sets. 
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MOREQUES-2 


REJECT(MAXCLUS)      INTEGER ARRAY   Rejection thresholds. 
REJECTH              SUBROUTINE      Calculates REJECT. 
TESTS(ND,NTS)        INTEGER ARRAY   Test pixels. 
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Parent : START MKKIVL 

MRKIVL(Z,A,N,M) 


This subroutine prepares the complement of the boundary for accumulation 
of components. 


Method : Recall that boundary points are marked with Z (-32768). The 
rest are counted left-to-right along line A in intervals, replacing the 
slot in A by the interval number. For example, a 20 point line on input 
might be 

ZZOOOZZZZOZOOZOZZZOO 
and the line returned would be 

ZZ111ZZZZ2Z33Z4ZZZ55. 

On return, M is the number of intervals found. 
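
A minimal Python sketch of this interval numbering follows. The function name and the list-based interface are assumptions for illustration; the original works in place on the Fortran array A.

```python
Z = -32768  # boundary marker

def mrkivl(a, z=Z):
    """Replace each non-boundary run in `a` by its interval number.

    Returns the relabeled line and the number of intervals found."""
    out = []
    m = 0
    after_boundary = True            # the line start acts like a boundary
    for value in a:
        if value == z:
            out.append(z)
            after_boundary = True
        else:
            if after_boundary:
                m += 1               # a new interval begins here
            out.append(m)
            after_boundary = False
    return out, m
```

Applied to the 20 point example above (with Z standing for -32768), the sketch reproduces the returned line and gives M = 5.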


Program Variables 

A(N)           INTEGER ARRAY   One line of boundary labels, in which 
                               intervals are to be found. 
I              INTEGER         DO loop index. 
IM             INTEGER         I-1 
M              INTEGER         Interval counter. 
N              INTEGER         Number in a line. 
Z              INTEGER         Boundary marker. 
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Parent : MAIN MSORT 

Calls : SHELL 

MSORT(MEAN,ND,NFC,DUM,INDX) 

Sorts final clusters by the average of the odd channels to aid interpretation 
of clustering. 

Method : First the sums are accumulated. At the same time, an index is 
set so that INDX(I) = I. Then SHELL is called. On return from SHELL, 
the means are reordered by the permutation of INDX. The actual means are 
now switched in place. 
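
The sort can be sketched in Python as below. This is an illustration under assumptions: the function name is hypothetical, Python's built-in sort stands in for SHELL, and a new list is returned rather than switching the means in place.

```python
def msort(means):
    """Order cluster means by the sum of their odd channels.

    `means` is a list of vectors; channels 1, 3, 5, ... (1-based) are
    elements 0, 2, 4, ... here. Returns the reordered means and the
    1-based permutation that produced them."""
    sums = [sum(vector[0::2]) for vector in means]        # role of DUM
    indx = sorted(range(len(means)), key=sums.__getitem__)
    return [means[i] for i in indx], [i + 1 for i in indx]
```

Because every vector has the same number of odd channels, ordering by the sum is equivalent to ordering by the average.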


Program Variables 

DUM(NFC)       INTEGER ARRAY   Used to accumulate sums in odd bands, 
                               and then as temporary storage to switch MEAN. 
I              INTEGER         DO loop index. 
INDX(NFC)      INTEGER ARRAY   The pointer array, used by SHELL to 
                               indicate actual order of DUM. 
J              INTEGER         DO loop index. 
K              INTEGER         DO loop index. 
MEAN(ND,NFC)   INTEGER ARRAY   The means to be sorted. 
ND             INTEGER         Dimensionality of MEAN. 
NFC            INTEGER         Number of vectors in MEAN. 
SHELL          SUBROUTINE      Sorts vector in increasing order. 
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Parent : MAIN NUMCLU 

Calls : COLAPS, DIAMTR, UNCLE 

NUMCLU(MEAN,ND,NP5,NP,TSPXL,NFCLUS,MINCLN,MAXCLN,CLASS,COUNT,ERROR,SAVE, 
DUM,CSAVE,UICB,IND,NUM) 

NUMCLU is the main clustering segment in AMOEBA. It carries out cluster 
formation according to the strategy suggested by the model (Appendix A). 

Method: On entry to NUMCLU, the set of all possible means is MEAN, and 
the test sets are in TSPXL. Also furnished are MINCLN (which, if negative, 
requests exactly -MINCLN clusters, or if positive at least 
MINCLN) and MAXCLN (which specifies the maximum number of clusters to seek). 
The following steps are carried out: 

(1) Classify the first and last of each test set in the nearest 
cluster (Euclidean distance). Save the classification in CLASS 
and count it in COUNT. Set LIVING = the initial number NP5 of 
clusters. 

(2) For each I, if COUNT(I) = 0, eliminate that cluster by setting 
NUM(I) = 0 and LIVING = LIVING - 1. 

(3) If fewer than 101 clusters are present, go to step (6), else set 
IE = 1. 

(4) Eliminate successively each cluster I with COUNT(I) = IE; 
reclassify test pixels assigned to eliminated classes; exit to (6) 
when the number of viable clusters falls below 101. 

(5) Increment IE and repeat (4). 

(6) Call COLAPS to remove eliminated centers. 

(7) Call DIAMTR to determine the diameter of starting clusters. 

(8) Set the initial elimination protection threshold IDIAMP. Except 
for the bias, IDIAMP is the diameter divided by 4*MINCLN. 

(9) Classify each test pixel by nearest neighbor and save the 
classification. 
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NUMCLU-2 


(10) Set NERR, an error counter, to zero; also set ERROR(I) = 0, I = 1, 
..., LIVING. 

(11) Count the number of times a cluster attracts a pixel from a test 
set and a pixel from a "far away" test set (in the order implied 
by the pre-sorting of test sets). Also accumulate in ERROR the 
count of this event per cluster. 

(12) Count the number of times a pair from the same test set is split, 
and credit to each of the clusters by incrementing ERROR of each. 

(13) Save the running minimum of NERR in MIN provided the current 
number of clusters is not greater than MAXCLN. 

(14) Determine which cluster I has ERROR(I) maximum. 

(15) Tentatively reassign test pixels assigned to class I; however, 
if any test pixel J is assigned more than IDIAMP away, execute (17). 

(16) Now dummy eliminate class I, set IDIAMP = IDIAMP-ND, and decrement 
the running number of clusters. If this number is less than or 
equal to MINCLN, execute (18). (Similar logic implements 
exactly so many clusters.) Otherwise execute step (10). 

(17) The biased distance is ID; set IDIAMP = (IDIAMP/3)*2 + ID/3 + ND 
and replace the mean in question by the (far away) test pixel. 
Replace the error counter here by half its former value and 
repeat step (14). 

(18) Now actually eliminate the clusters to the minimum NERR (the 
earlier eliminations were only dummies), and again call COLAPS to 
move means to the beginning of MEAN. The number of clusters is 
now NFCLUS. Exit. 
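
Step (12) is the heart of the error measure, and a small Python sketch may make it concrete. It is not the original code: the function name is hypothetical, each split pair is assumed to add one to NERR, and test sets are assumed to occupy consecutive groups of five entries in the classification vector.

```python
from itertools import combinations

def count_split_pairs(classes, n_clusters, set_size=5):
    """Tally pairs from the same test set that fall in different clusters.

    `classes` holds the 1-based cluster label of every test pixel, with
    each test set stored as `set_size` consecutive entries. Returns the
    total count and a per-cluster error list (index 0 is unused)."""
    error = [0] * (n_clusters + 1)
    nerr = 0
    for start in range(0, len(classes) - set_size + 1, set_size):
        test_set = classes[start:start + set_size]
        for a, b in combinations(test_set, 2):
            if a != b:               # the pair was split
                nerr += 1
                error[a] += 1        # both clusters share the blame
                error[b] += 1
    return nerr, error
```

A cluster that repeatedly splits pixels from one patch accumulates a large ERROR entry and becomes the candidate for elimination in step (14).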


Program Variables 

CHKIO              SYSTEM SUBROUTINE 
CLASS(NP)          INTEGER ARRAY       The class a test pixel is nearest. 
COLAPS             SUBROUTINE          Moves the vectors in MEAN to start, 
                                       eliminating gaps. 
COUNT(NP5)         INTEGER ARRAY       The number of classifications a mean 
                                       receives. 
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CSAVE(NP) 

DIAMTR 

DUM(NP5) 

ERROR (NP5) 

I. J.K. 

II, 12,13,14.15 

IC 

ID 

IDIAM 

IDIAMP 

IE 

IN 

INDO) 

I PMC 

IPRINT 

IT 

LIVING 


NUMCLU-3 

INTEGER ARRAY Scratch array used to save classifications 
while checking distances. 

SUBROUTINE Finds biased square of diameter of starting 
clusters. 

INTERGER ARRAY Dunmy, used to mark which are eliminated 
while computing the minimum number of errors. 

INTEGER ARRAY Nufrtier of errors for each cluster center; 
see above. 

INTEGER DO loop index. 

INTEGER The classification of the first through fifth 
of each test set pixel. 

INTEGER The classification of a test pixel far away. 

INTEGER The biased distance returned by UNCLE. 

INTEGER Square of diameter (biased) of starting 
cluster cent'irs. 

INTEGER Elimination protection threshold (biased by 
-32768). 

INTEGER Number of classifications to eliminate (a 
DO loop index). 

INTEGER Running current number of clusters being 
tested. 

INTEGER ARRAY Error indicator. 

INTEGER Estimate of PPMC (see Appendix A). 

INTEGER IDIAM with bias removed. 

INTEGER Far away index. 

INTEGER Current number of living clusters. 
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MAXCLN 

MEAN(ND,NP5) 

MIN 

MINCLN 

MJ 

MP 

NO 

NERR 

NFCLUS 

NP 

NP5 

NPA 

NPA2 

NUM{NP5) 

PRINTEQ(66) 

PRINTIT 

PRINTP 

SAVE(NP5) 

SEEK 

TSPXL{ND,NP) 

UICB(l) 

UNCLE 


NUMCLU-4 


INTEGER Maximum number of clusters sought. 

INTEGER ARRAY The cluster centers. 

INTEGER Running minimum number of errors found. 

INTEGER Minimum number of clusters sought. 

INTEGER The class with most errors during elimination. 
INTEGER Error count for seeking maximum number of errors. 
INTEGER Dimensionality 

INTEGER Local count of number of errors each trial. 

INTEGER Final number of clusters. 

INTEGER Number of test pixels. 

INTEGER Number of means. 

INTEGER NP/4; 1/4 of the way through the set of test 
pixels. 

INTEGER NPA*2; 1/2 of the way. 

INTEGER ARRAY Indicator to point to classes eliminated. 

Used at first and last of the program. 

INTEGER ARRAY Used for PRINTP I/O. 

CHARACTER Used for core-to-core formated write. 

SYSTEM SUBROUTINE 

INTEGER ARRAY Used to save classes eliminated (tentatively) 
in order. 

LOGICAL Used to carry cut Icgic for exactly so many 
clusters (if .TRUE.). 

INTEGER ARRAY The test pixels. 

INTEGER ARRAY User Information Control Block. 

SUBROUTINE Finds closest MEAN with NUM / 0 to TSPXL; 
distance is ID, class is II. 
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01499000 

      SUBROUTINE NUMCLU(MEAN,ND,NP5,NP,TSPXL,NFCLUS,MINCLN, 
     * CLASS,COUNT,ERROR,SAVE,DUM,CSAVE,UICB,IND,NUM,MAXCLN) 
C 
C     PARENT PROGRAM: MAIN 
C     DAUGHTER PROGRAMS: UNCLE, COLAPS, DIAMTR 
C 
C     ---PARAMETERS IN NUMCLU--- 
C 
C     CLASS  -- INTEGER*2 VECTOR USED TO STORE THE CLUSTER 
C               TO WHICH A TEST PIXEL (IN TSPXL) IS 
C               ASSIGNED. 
C 
C     COUNT  -- INTEGER*2 VECTOR USED TO TALLY THE NUMBER 
C               OF TIMES A TEST PIXEL IS ASSIGNED TO 
C               EACH CLASS (IN MEAN) 
C 
C     ERROR  -- INTEGER*2 VECTOR USED TO ESTIMATE THE 
C               RELATIVE PERFORMANCE OF EACH CLUSTER 
C               VS ALL THE OTHERS. NOTE: 
C               ERROR(K) IS INCREASED BY 1 WHEN 
C               CLUSTER K IS INVOLVED IN SPLITTING 
C               A PAIR FROM THE SAME PATCH; 
C               INCREASED BY 1 WHEN 
C               CLUSTER K ATTRACTS A PAIR FROM 
C               DIFFERENT REAL CLASSES 
C 
C     AT EACH ELIMINATION CYCLE, THE CLUSTER WITH ERROR RELATIVELY 
C     LARGEST IS ELIMINATED; THE EFFECT OF THIS IS 
C     THAT A CLUSTER SURVIVES IF IT 
C 
C               (A) DOES NOT SPLIT PAIRS 
C                   FROM THE SAME PATCH 
C               (B) DOES SPLIT PAIRS FROM 
C                   DIFFERENT PATCHES. 
C 
C     SAVE   -- INTEGER*2 VECTOR USED TO STORE THE ORDER 
C               IN WHICH CLUSTERS ARE TO BE ELIMINATED 
C               ONCE THEIR NUMBER IS DETERMINED. 
C 
C     DUM    -- INTEGER*2 VECTOR USED TO MARK CLUSTERS 
C               WHICH WERE ELIMINATED IN THE INITIAL 
C               ELIMINATION CYCLE PRIOR TO THEIR 
C               ACTUAL ELIMINATION 
C 
C     CSAVE  -- TEMPORARY VECTOR USED TO SAVE CERTAIN CLASSIFICATIONS 
C               WHILE THE ENTIRE SET IS BEING CHECKED FOR ACCURACY 
C 
C     LIVING -- PARAMETER MARKING THE CURRENT NUMBER 
C               OF CLUSTERS 
C 
C     NERR   -- THE NUMBER OF TEST PIXEL PAIRS WHICH 
C               ARE SPLIT PLUS THE NUMBER OF DIFFERENT 
C               PAIRS WHICH ARE JOINED DURING A CYCLE 
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PACE ««SI NVnCLU 

I 

•«a« «u«4««« c 

( niACu* -• rNE niNiHKn NuntCK or ciusrcRf to«6HT 
•oait oittEooo c ir hinclm is HECArivc. exactly -aincln 

•«as( «t«froO« C CLUSTERS ARE SCU6HT (RATHER THAHCE.) 

soass otsssoos c 

0«ass AtSSSOSe C RAXCLN -- THE NAXIRUn NURRER OF CLUSTERS SOUSNT (IN NUNtlU) 

Coast OITOCOCO C in the current VERSION. RAXCLH > Si. 

coats circtccc c although it coulo easily se race a rrograr rarareter 

coast ciTcaccc c 

coast oirosooo c stcr •> h flac to shitch oetuien rxA.:rLT and ce 

coast OITC4COC c rincln clusters 

coass ciToscco c 

coats oircscoo c idiax -• the square of the diareter of the starting cluster set 

ccass CIFC7CCO c 

coast 017COCCO c idiarr -- an elirihatioh rrotection threshold hnich rrevents 

coast oiroscoc c the loss of clusters needed for the reasonasle class- 

coass OIFIOCOC C IFICATION of sore test RtXfLS 

ccass ciTiicoo c 

coast eiRiacco c rin •• running rihihuh nuroer of errors 

coast cirisoco c 

ooast OIM4CCC C NIA -- NURCER OF CLUSTERS Ul TN RIN ERRORS 

coast ciMseco c 

coast OITISOOC tHlC6ER*a TSRXL( HO. NR >.NEAN( HD. RRS >. NUH(NRS >. CLASSTNR ) 

ocacs cmrcoo • .COUHT(HRS>.ftROI(NRS>.SRVC(HRS>.»UH(NRS).CSRVt(NR> 

ocass CITttOOO • .RRINTEO(SS>.UICO( I ). INOI 4} 

ocass ciMsooo logicrl seek 

coais ctraocoo CNARAtTCR*ra rrihtit 

coats ciraiono couivalchckrrihtit.rrinteo) 

ooast ciRaacoo hirclr ■ rincln 

coaro CI721CCC eo iiti i ■ i.hrs 

coars ciTacccc iiii nurcd ■ t 

C030I ciTasocc c 

00301 OIR2SOCO t this SCCNENT FOLLOHS ORDERS RE HON RANT CLUSTERS ARE REQUIRED 

cosot ciraacoo seek ■.frlse. 

C030; CIF20C0C IF (HIRCLR. LT.D> GO TO 04 

00307 OiratODO if (RINCLR.lt a . or rincln GT 100> hihcln • s 

00317 DI7300CC »7 UR T T E( RR I NT I T . Nt ) RINCLN 

CC340 C173I900 CALL RF I N TR< U t CQ . t NO. I . RR INI EC. 42 . C . C. 0 . 0 . C , 0 . C, 0 > 

CC3TC oinaoco if (IRC(I>.LT c> call CNKIOI UICO. THO. l .e.o.o. taco 

CC413 0IF33CC0 eo TO cs 

C04t« 0IFJ4CCC SS NINCLR ■ -RINCLN 

00420 Ct73SC0a |F (NINCLN.GT ICO OR NINCLN LT.a) NTNCLN ■ IS 

C043C 9I73SOCO SEEN • TRUE 

C04ia CI 737CCC URirE(PRIHTlT.C4 ) RINCLN 

coos; 91730009 CALL RP I NTR< U T CO . I NO. I . RH I NTEC . 32 . 0 . 0 . 0 . 0 . C . 9 , 0. 0 } 

CCSOI 01739009 IF IND(l.' LT 9) CALL C HF 1 0( U ICR . INP , I , D . 9 , O . I 3D0 > 

DPSaC OlFOCOPO FOrnSTi • exactly' , i t, ■ clusters sought ■ > 

cosas (>1791000 SS FORRATv' RlHinuN NUrOER OF CLUSTERS tOUCHTl'.IS) 

oosas ciFtaoco cs hinclr • rincln-i 

00317 0I74300C C RAXCLR • 90 tRAXCLN NON A RROGRAR RARaR** •••••*•• •• 

00S!17 CI744C0C C 

C0S17 9I74SC4C C RRINT STARMRC CONOITIONS 

ccsar CI74S00C VRtTE<rRINT|T.I09> HRS. NR 

cessa CI7470CO call RRIHTRIUICQ.IND. I.RRIMTEO. 47.9. C. 9.0.0. C.C.O 

ACCCa 0I74000C IF (INO(l> LT 0) CALL C NM 0( U It R . I HD . 1 . 9 . 9 , 0 . I 490 ) 

coca? CI74SSSC 199 POPRAK - «TNRt UTIH'.M. t L0« T£ RI , ' . • 3 . ' TEST R01RT8 '> 

sssas ciRssccc c 


I 
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• »«atr ti7sioo« c cLCAii « couitrcR or thc Nviiict or riiirt « ciwirii it hit 

• ••19 •ir9IOO« C •¥ H TttT MHtL IH THE IHITIHL CLHlt t FI CHTIOH . 

••«I9 tirtlO^O 00 1 I • I.HF9 

•••la •irs 49 «^ a coohm i> > • 

•••!< •IF99000 C 

•etif oirtcopo c clhisify «no couht the CLHStiricHfioH or cvtor s-th test fixel. 
•o< 9 ( oiTsroto c 

•cca« •irjtooo c vfttioH II otcs hi trHtriHc ciotrcot phtch cehtem emh phtcrcs 

•••i( •irsoooo c corthiriho 9 ot note pixels, this heohs the tihe spert ih hohclu 

••cat •iTtoopo c IS e(RP**i). 

•««a* •irti»«« c 

• 0(11 oirtaoRo to 1 1 ■ i.hp.s 

•C<«; •IT«30«0 CHLL UHCLEI TtPXK I . I I.REHH.HO.HPS. II .HUH. It) 

•C««l «!FS4»04 COOHTCII) ■COUHTCII)*! 

• C(«9 OITOPOO CLHtt(l) > II 

•0<r» Oirtttt* COLL UHCLEITtPXUI . M4l.nCHH.H0.HP9, tl.HHH. I»> 

•cror oiFtFPOo cooHum • coonTcii>«i 

• OPIJ 0IT«0«4* I CLHtt(M4> ■ II 

•erai oirtsooo c 

•ern 0IFF0049 C LiriHC II THE TCHPOOHOT ROHOER OP LIVIHC CLOtTERt 
•«*ai oirriooo Llvinc ■ hps 

• 473 ! •irraoti 00 1 1 • i.rpo 

• CMC •IFFJ0C9 IP CtOUHMI ).CT.«> «0 TO I 

•CF4I VITT4090 C 

•0T4t Oir/3009 C ELtniHHTC • CLVtTCR TO HHICH HOTHIHC It HttICHCO. 

• •P4I •imOtO HUR(I) • • 

•CF44 0IT7T99* LIVIHC ■ LIVIM6-I 

••749 OIF7t«4« I COnriHUE 

• CMI •I779900 C 

• 074( 0I790004 c REPORT NOH RHHir CLVSTERI HE tTHRT HI TN 

• •744 0I79I940 *RI TEI PR IHT 1 1 , 141 ) LIVIHC 

••7«r OI7ia4t« 141 FORRHIC '.14.' CLOtTERt HOVE HOH VOlO Hit IGHRCHTt . ' > 

• 47I7 0|74t440 CHLL PR INTPC 0 1 CO . I NO . I . PRI HTEO. 4 J , • . 4. • . • . • , 4. • . 4 > 

• 1417 41 704444 IF C IND( I > . L T . • ) CHLL CHKI OC U ICO • IHO . I . •. •. • . 1 314 ' 

4144a 4I7I3444 C 

41441 4|7t«444 C OE HRE HINIHC FOR HO HORE THPH !•• TO tHVC TIRE IH THE OHO LOGIC 

41441 41717444 IF X L I V INC LE 1 44 > GO TO 7 

41494 4I7II044 00 4 tC • 1.140 

41499 41 733444 00 3 I ■ I.HP9 

•lOi: (1730044 IF (COUNT(I).NE IE> CO TO 3 

41417 01731444 C 

•1417 41732404 C CLINtNINHTC HR UNPOFOLHR CLHtt HHO . . 

41017 0I73J404 LIVING > IIVIN6*I 

C IC7( 41 734494 NUUC I > • 4 

4IC7Z 41733444 C 

41471 01734004 C . RECLHSIIFV PIXELS *N INHT CLHSt 

4I47I CI737044 00 4 4 • I.NP.3 

41144 OI73t444 IF ( Cl *3 l( 2 > . NE . I ) CO TO 94 

• II4! 41733444 C4LL UNC L E( TS FXL < I . 4 > . NEHN . NO . RP9 , 1 1 , NUN. 10 > 

• II2J OI344440 CLHSKJ) • II 

• liat 4ia»l440 COONTdtl • COUNTCIIMI 

OMTa 01002440 34 IF (ClAS3(a*4) HE. I) C9 TO 4 

411*1 (1841400 >:3|.L UHCLFt TSFXL ( I . 4*4 > . NEHN ■ NO.NPS. I I . non. I 0> 

41114 (IIOIOOO CL4S$(J«4> . II 

4IM9 CIVOSOO^ CC'.'NKIl' • <;ouxT.I|>*| 

• 1 171 41144444 4 COnMNUE 

•1172 4II47444 C 


r 
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r*M •«SS HUNCLU 


• tir2 AllMO^O C tie ir Itc «ltc MNItHCO. 

•lira eit»9**« if CLiviNC.Lt. i«o) co tq r 

oiirr »itioooo 9 contihuc 

•laoo 4 COHTINUC 

•1341 0ltl3»04 C 

•ia«l •lltlO** r CALL COLAPSCHFS, ACAN>NUA.NP> 

4iai4 •ItMAO IF (AINCLN.bf .LIVING) GO TO OOT 

012)9 01019000 C 

01219 oloioooo c (orinorc the oianctea of tnc otaatihg set of clustca centeas 

01219 0I017000 CALL 0 I AN TA( RCAN • ND . L I V I NC . I D l«H > 

01221 OlOliOOO C 

01221 01019000 C FAINT TNC OIAHCTCA. 

01221 01020000 IPAIHT • 101 AR4]2T(F* 1 

0I22F 01021000 IF <1FAINT.LT 4> IFAINT • 12FGT 

01214 01022000 VAtTCCFAlNTlT.OI > IFAINT 

01299 01021000 01 FOAnATC' OOUAIC of OIANCTEA OF STAATING CLUt TEAS > ' . 1 4 > 

01299 01024000 CALL FA |HTF( II ICO . INO. I . F Al NTCO. 91 . 0. 0. 0. 0 . 0 . 0. 0. 0 ) 

01209 01029000 IP CINOCD.LT.O) CALL CNAtOCWICO. INO, 1.0. 0.0. 1944) 

OHIO 01024000 C 

OHIO 0102FOO0 C OCT THACOHOLO FOA CLININATION FAOTCCTION 

01210 01020000 lOIAHF ■ IFAINT/NINCLN/4-32I40 

0IS49 01029000 C 

01249 OlOSOOOO C INITIALI2C A OURNT VCCTOA OP FOINTCAS TO CLIAINATCO CLUtTEAS 
01149 OlOllOeO 00 0 I • I. LIVING 

01292 01012000 0 OUH(l> ■ I 

01234 01012000 C 

01194 01024000 C FACFAFC TO COUNT CAAOAO 

01294 01019000 IN > LIVING 

01340 01014000 RIH ■ 12747 

01142 01027000 C 

OlKl 01010000 C HFA !9 1/4 OF TNC NAT ON ONE SIDE OP TNC POINT UC AAE NOAAING ON 
01142 01019000 NFA - NF/4 

01149 01040000 C 

01249 01041000 C NFA2 IS 1/2 IR TNC OADCA INFLtCO OV TNF ACSULT OF SOFT. 

01149 01242000 NFA2 • NFA42 

OHIO OI041OOO C 

oiiro 01444000 c ClAISIPt all test POINTS FAIOA TO GETTING OTAATCO. 

01170 OIA49000 00 9 I ■ I.NF 

01279 0IS44000 C ALL UNCLE! TOP AL( I . I ) , HE AN , NO , L I V INC . 1 1 . OUR . ID ) 

01411 0104*000 9 CLASt<I) ■ II 

01417 01040000 C»**»4»*»o»*4**»4*»* »•♦•*••♦•**♦••****•♦»•••♦•••••»*•*•••••• ••••*»• 

OI4I7 0IS4900O C 

01417 01A90000 C ACPCACNCC POINT FOR NAIN LOOP. 

01417 01091000 C 

01417 01092000 C INITIALIZE LOCAL COUNT OF CAAORO 

01417 01991000 11 NCAA • 0 

01421 01094000 00 40 I ■ I. LIVING 

01424 01999000 40 CRAORl I > • 0 

01431 0IA94O00 C 

01412 01497000 C GO TNFU 7ATCNCS (I C.TNAU TEST PIXELS IN SETS OF 9) 

01412 01090000 00 29 I « I.NF. 9 

01417 O109SO0O C 

01417 0IS40009 C GAAO » TEST SET OF 9 FAOA THE SANE PATCH 

01417 01041000 11 - CLASS! I) 

01441 OII42VOV 12 > CLASS! HI) 

01447 0IV«1000 IS • CLASS! I«2> 

01494 01044000 14 • CLASS! I«l> 


m 
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i 

f 

f 


f«CC **94 IWMLW 


•1441 41149444 19 - CL*94(I*4> 

41444 41*44444 C 

4M44 CI«irOO« C 44rc AN IMOCX IN10 TMC TE9T MXfLt 4HICH 4H4UL4 4C F«4 IH0464 
4144* OIt*4444 C AMAT tO tC IN A AIFFCACHf REAL CLA44 <AirN9»6N TNI* CANNtT 4E 

4I44* 41449444 C GUARANTfEO. THEN CHCCE If If 14 OFF TNE ARRAT . 

414*4 41114444 IT • t«HPA 

4I4FI OI*ri444 IF (ITCTHFI IT ■ I-NFA2 

414FF 4I9F1444 C - - - 

4I4FF 4IIFS494 t 

414FF 41414444 C THM 4E6AENT COUNT! TNE NNNRER OF TIRES A CLUSTER ATTRACTS A 

41411 41919444 C PIXEL F33n ONE OF TNE TEST SETS ANO FROA ONE IN THE FAR AUAV 

4I4F1 41914444 C CLASS bi.LT ONE CLUSTER IS INVOLVEO IN THIS TTPC OF ERROR. 

41411 41911444 00 1* 4 • 1.2 

41944 41910444 IC ■ CLA94(tT> 

41941 41919444 IF <IC.NE.tl> 60 TO II 

41924 4I44-444 CRIOPdt) ■ ERRORL 1 1 >*l 

41924 414.1444 NEAR « NERR*t 

41929 OI44244« II IF (IC NE.I2) 60 TO 12 

41911 1:949444 ERR0R<I2> ■ ERRORdsm 

41999 41«I4444 NEAR > NCRR*I 

4I9TL 41499444 12 IF dC.NE.l3) GO TO 19 

41942 4199*000 EC*0Rd9) « CRR0ed9)*l 

41941 01991000 NERR • NERR41 

41941 41900444 13 IF (IC.NE.I4) 60 TO 14 

41993 4109*444 ERROR! 14) ■ ERR0Rd4)*l 

41991 419*4444 NCR* ■ NCR**! 

41940 41*91444 14 IF dC RE. 19) GO TO 19 

0i;i4 41192000 ERROR! 19) > ERR0R!I9)*I 

0191! 01499004 NERR ■ HERR*l 

41911 014*4044 C 

01911 414*9404 C HARE ANOTHER INDEX FAR AUAT . 

4IS1I 419**444 19 IT • l-NPA 

41914 0I991444 IF dT.LE.4) IT ■ l*HPA2 

41402 01**9044 1* CONTINUE 

01*03 419**444 C- - - 

41*43 01*44444 C 

41*41 41*41044 C 

41*09 01*42044 C 

41441 01T43444 C THIS SECHENT COUNTS THE NUHOER OF SANE PATCH •• OIFFERENI 

41403 01944444 C CLUSTER ERRORS EACH ERROR IS CREOITED TO TOO CLUSTERS 

41*03 41*094*0 C SINCE UE CAN OE CONFIDENT THAT SANE PATCH SANPLEt ARE 

*1*4! 0I99C'** C IRON THE SANE REAL CLASS. 

41401 OltdvOO IF dl.E0.l2> GO TO 42 

*1401 0<*09000 NEFF • NERR*2 

41*1 1 4199*940 ERRORlIl) ■ ERRORdl)*! 

*1*19 01*14040 ERRORdl) ■ ERR0R!I2)«I 

01*21 01*1194* 43 IF dI.EO 13: 60 TO 44 

*1*33 01*13444 NERR • NERR*3 

01*31 01*13044 ERROR! II) ■ ERRORdl)*) 

01*33 01*14440 CRP0*!I3> • CRR0*!I3)*) 

4t*'31 41*190*0 44 IF dl.E0.l4) 60 TO 4* 

41(43 01*1*440 NERR • NERR*3 

91*4*. 01*11440 ERROR! II) ■ ERROR! It)*! 

41**1 91*1*4*4 ERF0R1I4) • ERROR! I4)*l 

*1*33 o:*l*4*4 4* IF (II EO 19) CO TO 40 

*1*41 01*2*4*4 HER* • HEAR*! 

*1*43 01*310*4 ERfOt!ll) ■ ERROR(|i;*l 
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»u«r 


tSSOK 13) ■ IStOjK I3)«l 
41 17 <12.(1. ID SO TO 34 
MCtI ■ Mltlf2 
(ItOK Il> ■ C7C0IU I2)«l 

t*roii( n > • f4«0R( ID4I 
36 17 (12 (0.14) 60 TO 32 
KM > KM42 
IM0R( 12) ■ (OlOK |2>*l 
(MOK 14 ) • CRItOK 14)41 
32 17 ( 13 (0.19) 60 TO 94 
NCM ■ NCM*2 
(MOK II ) • («R0I( 13)41 
(*70R( 13 > ■ CMOK 19)41 
V4 17 (II. to. 14) 60 TO 96 
KM • MCM«2 
M'OK It > - («R0«( l])*l 
CMOK 14 > ■ (M0«( 14)41 
96 17 (13.10.19) 60 TO 90 
N(77 ■ MCM42 
(eSORt II > ■ MR0R( II)4| 

CRIORt 19) • IRROHC I9)4| 

90 17 ( 14.(0.19) 60 TO 29 
N(M - NMR42 
(R70R( 14 > ■ (RR0R( 14)41 
(Rf 07(13) ■ (RR0R( 19)41 
C 

e RLl. Cn-'-lTIO 70R TRI* 7RTCR 

C 

C 6(T TH( N(XT RRTCN 
29 C0MT1NU( 

C 

C S(( 17 CRRORI ••( L(6S TNON 0(70R(. 

17 (MRI 6( HIM OR Ifl.K.RRXCLR) 60 TO 192 
C 

C TM .ORORTr RUNMIR6 RIR 
RIN ■ NMR 
H|R ■ IN 

6 

C TNK 3(6N(NT CRRRK6 OUT TN( L06IC OICTRIM OV TNI N(C0 F< R 
C (RACUr 10 RANT CLOtTERI. (IT 406Y CR(V THIS NRV.) 

1*2 IN > IH-I 

17 <0E(K> HIN ■ lN*t 
17 ( IN L(.NIHCLN) 60 TO 101 
N7 • -12767 
C 

C 7IN0 TH( CLUDER MNICN S((NS TO CAUSC TN( HOtT (*R0RS 
00 41 J • 1 .LIVIN6 
17 (0UN(4>.(1.»> 60 TO 41 
17 (ERROR! J > L(. HR ) 60 TO 41 
Hf ■ CRR0R(4> 
tH • t 
41 CONTINUE 
C 

C STAC7 THE INDEX 07 THIS ROOT 077ENSIVE CRITTER 
t A*E( IN > ' N I 
C 

t RNO NARR IT OecK.. (17 NOT DEA»> 
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ISS» lA ■ IA*I 

SARCRJ > > I 
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Parent: CLASSIFY PERPIXEL 

Calls : REJECTH 

PERPIXEL(PIXELS,CLUSTERS,LABELS,ND,NC,NFCLUS,REJECT,MAXCLUS,MASK,NX) 

Performs a per pixel classification of a line. 

Method : This subroutine performs a checked per pixel classification 
of multidimensional data in PIXELS to classes in CLUSTERS. Nearest 
neighbor (Euclidean distance) classification is employed. However, when 
the distance is too great, the classification is rejected, and (generally) 
a new cluster center is added. On that event, the cluster reject thresholds 
are recomputed and this line classification is restarted. Labels 
(i.e., class numbers) are written in line LABELS. 

The program is straightforward with the possible exception of the 
biased distances and the modification of rejection thresholds in loop 120. 
All distances are biased by -32768; the rejection thresholds are also 
biased by -32768 in REJECTH. We wish to make the test less severe than 
was applied to pure pixels (the test pixels), so we logically multiply 
the thresholds by 2 (which, in terms of distances, amounts to deciding 
reject on 0.7 * distance between competing classes rather than 0.5 * 
distance between). Thus, if r is the biased and R the logical threshold, 
we have r = R - 32768, so that 

r' = 2R - 32768 = 2(r + 32768) - 32768 
= 2r + 32768. 
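
The control flow of the checked classification can be sketched in Python as follows. This is a simplified illustration, not the original routine: it works with unbiased squared distances, `reject_fn` is a hypothetical stand-in for REJECTH, and a pixel that would be rejected when the cluster table is already full is assumed to keep its nearest class.

```python
MASK_LABEL = 99  # label stored for masked (off-image) samples

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify_line(pixels, clusters, reject_fn, max_clusters, mask=False):
    """Nearest-center classification of one line with rejection.

    `reject_fn(clusters)` returns one unbiased squared-distance threshold
    per cluster; each is doubled here to relax the test. A rejected pixel
    becomes a new center and the line is classified again from the start.
    Returns 1-based labels and the (possibly extended) cluster list."""
    clusters = [list(c) for c in clusters]
    while True:
        limits = [2 * t for t in reject_fn(clusters)]
        labels = []
        for px in pixels:
            if mask and px[0] == 0:
                labels.append(MASK_LABEL)
                continue
            dists = [squared_distance(px, c) for c in clusters]
            k = min(range(len(clusters)), key=dists.__getitem__)
            if dists[k] > limits[k] and len(clusters) < max_clusters:
                clusters.append(list(px))    # add a center, restart line
                break
            labels.append(k + 1)
        else:
            return labels, clusters
```

Each restart adds one center, so the loop ends after at most `max_clusters` passes.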

Program Variables 

AO                     INTEGER*4       Long integer to carry out the threshold 
                                       arithmetic intermediate steps. 
CLUSTERS(ND,MAXCLUS)   INTEGER ARRAY   The clusters. 
I,J,K                  INTEGER         DO loop index. 
ICL                    INTEGER         Class number of nearest cluster. 
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PERPIXRL-2 


IM                     INTEGER         Distance to nearest cluster. 
IS                     INTEGER         Accumulator for computing distance. 
IT                     INTEGER         Temporary for computing distance. 
JR                     INTEGER         Reject threshold of current class. 
LABELS(NC)             INTEGER ARRAY   The classification. 
MASK                   LOGICAL         If .TRUE., a value 0 in channel 1 is used 
                                       as a mask (i.e., not data), and the label 99 
                                       is stored in LABELS. 
MAXCLUS                INTEGER         Maximum number of clusters (current value, 
                                       set in MAIN, is 98). 
NC                     INTEGER         Number of samples per line. 
ND                     INTEGER         Dimensionality. 
NFCLUS                 INTEGER         Current number of clusters. 
NX                     INTEGER         Actual number of samples per line 
                                       (approximately NC, but often a little less). 
                                       NC is used to dimension things; NX may vary 
                                       from one call to the next. 
PIXELS(NC,ND)          INTEGER ARRAY   One line of data. 
REJECT(MAXCLUS)        INTEGER ARRAY   The reject thresholds. 
REJECTH                SUBROUTINE      Calculates REJECT. 
Parents: MOREQUES, PERPIXEL                                    REJECTH

REJECTH(CLUSTERS,ND,REJECT,NFCLUS,MAXCLUS)

This subroutine determines the cluster-dependent classification rejection
thresholds.

Method: Using a bias of -32768, the inter-cluster squared distance from
cluster I to cluster J is maximized. The maximum over J, divided by four,
is the classification rejection threshold for cluster I, and is stored in
REJECT(I). To minimize round-off error, the division by four is carried
out in two stages; the second accounts for the bias: if D is the
once-halved squared distance and R = D - 32768 is its biased form, then

     D/2 - 32768 = R/2 - 16384

should be stored as the twice-halved squared distance.
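
The threshold rule and the second halving step can be sketched as follows. This is a Python illustration under stated assumptions: the names reject_thresholds and halve_biased are hypothetical, the first function works on unbiased values, and a single integer division by four stands in for the two-stage division of the original.

```python
def reject_thresholds(clusters):
    """For each cluster, a quarter of the largest squared distance
    to any other cluster center (unbiased values)."""
    out = []
    for i, ci in enumerate(clusters):
        dmax = 0
        for j, cj in enumerate(clusters):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(ci, cj))
                dmax = max(dmax, d2)
        out.append(dmax // 4)
    return out


def halve_biased(r):
    """Halve a logical value D held in biased form r = D - 32768.

    The result is the biased form of D/2: D/2 - 32768 = r/2 - 16384.
    """
    return r // 2 - 16384
```

Because the stored value is a squared distance, a quarter of the squared distance corresponds to half the distance between the two centers.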


Program Variables

CLUSTERS(ND,MAXCLUS)  INTEGER ARRAY  The cluster centers.
I,J,K                 INTEGER        DO loop index.
IM,IS,IT              INTEGER        Used in calculation and maximization of
                                     distances.
MAXCLUS               INTEGER        Maximum number of clusters.
ND                    INTEGER        Dimensionality.
NFCLUS                INTEGER        Actual number of clusters.
REJECT(MAXCLUS)       INTEGER ARRAY  Rejection thresholds.
Parent: MAIN                                                    SETSYM

SETSYM(SYMBOL)

Sets SYMBOL to the appropriate character set for display in MAPP.


Method: Because of the restrictions of HP FORTRAN, SEGMENTER and of
IDIMS, this usually straightforward task is 14 lines long. The result
is to set SYMBOL(1) equal to a blank and SYMBOL(2) to SYMBOL(59) to the
(printable) ASCII characters between 33 and 90 (decimal).


Program Variables

CC(2)       CHARACTER*1 ARRAY  Equivalenced to IC.
IC          INTEGER            Used to transfer ASCII binary to characters.
J           INTEGER            DO loop index.
NDX         INTEGER            Index into SYMBOL.
SYMBOL(59)  CHARACTER*1 ARRAY  The symbol set.
Parent: MSORT, SORT                                              SHELL

SHELL(KEY,INDEX,NUMBER)

Sorts KEY into increasing order; the elements of KEY are not interchanged.
Rather, the array INDEX is permuted so that KEY(INDEX(I)) < KEY(INDEX(J))
when I < J.

Method: See Shell, D. L., A high-speed sorting procedure. Comm. ACM 2 (1959),
30-32.

Program Variables

INDEX(NUMBER)  INTEGER ARRAY  On entry, INDEX(I) = I; on exit,
                              INDEX is permuted.
KEY(NUMBER)    INTEGER ARRAY  The list to be sorted.
NUMBER         INTEGER        The number of items in the list.
I,IM,IT,J,K,M  INTEGER        Internal variables.
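
An index-permuting Shell sort of the kind described can be sketched as below. This is a Python illustration, not the FORTRAN routine: the name shell_index_sort is hypothetical, indices are 0-based, and the halving gap sequence is Shell's original choice, assumed here because the increments used by SHELL are not documented.

```python
def shell_index_sort(key, index):
    """Permute index in place so that key[index[0]] <= key[index[1]] <= ...

    key is never modified; index must start as a permutation of
    0, 1, ..., n-1 (the FORTRAN original is 1-based).
    """
    n = len(index)
    gap = n // 2
    while gap > 0:
        # Gapped insertion sort on the index array, comparing keys.
        for i in range(gap, n):
            item = index[i]
            j = i
            while j >= gap and key[index[j - gap]] > key[item]:
                index[j] = index[j - gap]
                j -= gap
            index[j] = item
        gap //= 2
    return index
```

Sorting the pointer array rather than the keys is what lets the caller (SORT) later move whole test sets according to the same permutation.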
Parent: MAIN                                                      SORT

Calls: SHELL

SORT(TSPXL,DUMMY,INDEX,ND,NP,NPS)

Sorts TSPXL in increasing order of the sum of odd channels of the first
test pixel in each test set. Sets are kept together.

Method: First the sums of odd bands of the first test pixel in each test
set are accumulated, and an index into the sets is formed so that
INDEX(I) = I. Then SHELL is called. On return from SHELL, the test sets are
reordered by the permutation of INDEX induced by SHELL. The actual sets are
switched in place.


Program Variables

DUMMY(NPS)    INTEGER ARRAY  Used to accumulate sums in odd bands of
                             the first element of each test set, and then
                             as a temporary vector while TSPXL is being
                             reordered.
I             INTEGER        DO loop index.
INDEX(NPS)    INTEGER ARRAY  The pointer array, used by SHELL to
                             indicate order of DUMMY.
J             INTEGER        DO loop index.
K             INTEGER        DO loop index.
L             INTEGER        DO loop index; note: the index into TSPXL is
                             I*5 + L - 5, where I is the test set number
                             and L is the pixel number within the set,
                             I = 1,...,NPS, L = 1,...,5.
ND            INTEGER        Dimensionality of TSPXL.
NP            INTEGER        Number of test pixels.
NPS           INTEGER        Number of test sets (= NP/5).
SHELL         SUBROUTINE     Sorts vector in increasing order.
TSPXL(ND,NP)  INTEGER ARRAY  Test pixels, organized as vectors, then
                             sets, then count of sets, to be reordered by
                             increasing sum of odd bands.
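
The reordering performed by SORT can be sketched as follows. This is a Python illustration with hypothetical names (sort_test_sets, set_size); it uses Python's built-in sorted in place of SHELL and returns a new list instead of switching the sets in place, so it shows the ordering rule rather than the original memory handling.

```python
def sort_test_sets(tspxl, set_size=5):
    """Reorder whole test sets by the odd-channel sum of each set's
    first pixel.

    tspxl -- flat list of pixel vectors; consecutive groups of
             set_size pixels form one test set.
    """
    nsets = len(tspxl) // set_size
    sets = [tspxl[s * set_size:(s + 1) * set_size] for s in range(nsets)]
    # Channels 1, 3, 5, ... (1-based FORTRAN) are 0, 2, 4, ... here.
    keys = [sum(s[0][0::2]) for s in sets]
    order = sorted(range(nsets), key=keys.__getitem__)
    # Sets move as units, so the pixels of a set stay together.
    return [px for s in order for px in sets[s]]
```

For example, with two-pixel sets, a set whose first pixel has odd-channel sum 2 is moved ahead of one whose first pixel has odd-channel sum 10.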
Parent: MAIN                                                     START

Calls: MARKLR, MARKUPDN, MRKIVL, CONNCT

START(INTTHR,ND,NR,NC,NZ,DATBUF,LABBUF,ISCAN,UICB,IND,IMGIN,IMGCLAS,LAB,
      MASK)

START makes the vector boundary decisions and returns a map (on disk) of
the components of the complement of the boundary.

Method: The program employs Wide Image Logic to segment the image into
strips. Three lines of data and three lines of labels are active at any
given time. Circular buffers, using pointers I1, I2, and I3, manage this;
I1 always points to the oldest line and I3 to the newest. Even so, we can't
phrase the description of the method in absolute terms (i.e. as if the
entire image were in memory at once). The initial phase must be described
first.

     Initially, a labels buffer is set to all ones. Then the data is
scanned, and points which fail the vector gradient test in the
left-right direction are marked (subroutine MARKLR). Then up-down boundaries
are located (subroutine MARKUPDN), and one-pixel gaps in the boundary map
are filled (subroutine FILLLR). Then the logical OR of the first and second
lines replaces the first. (These are details, but this is detailed
documentation, and FORTRAN loves it.) We now have an excellent estimate of
the boundary in the first two lines and a fair first cut on the third.
Subroutine MRKIVL is called to mark intervals in line 1. MRKIVL replaces
each interval contained in the complement of the boundary along a line with
(successively) 1, 2, ... . If any patch slices were found in this step,
then the first line is initialized with patch labels (starting at -32767).

     Now the big loop is entered. Intervals in line I2 are marked. If
any patches are found here, subroutine CONNCT is used to transfer old labels
to new intervals or to begin new labels. Row I1 labels are written and new
data is read into DATBUF(.,.,I1). New labels are marked with 1 and the
circular buffer pointers are rotated. (Now I3 is the newest.) We now
call MARKLR on line I3 and then MARKUPDN, and return to the top of the big
loop. Exit from the loop when we run out of data (or labels). If out of
data (or at the bottom of a strip), process lines I2 then I3 and write on
disk. If out of labels, simply paint the rest of the image boundary. (Since
64K labels are allowed, this seems unlikely to happen on even the largest
natural images.)

     The initialization phase guarantees that labels are not propagated from
the bottom of one strip to the top of the next.
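
The left-right gradient test and the interval marking along one line can be sketched as below. This is a Python illustration with hypothetical function names; it assumes that the pixel to the right of a large jump is the one marked as boundary, which the documentation does not state, and it leaves out the up-down test, gap filling and label transfer between lines.

```python
BOUNDARY = -32768  # boundary marker (the variable Z in START)


def mark_left_right(line, thresholds):
    """Return 1 for pure pixels and BOUNDARY for pixels that differ
    from their left neighbor by more than the channel threshold in
    any channel."""
    labels = [1] * len(line)
    for j in range(1, len(line)):
        if any(abs(a - b) > t
               for a, b, t in zip(line[j], line[j - 1], thresholds)):
            labels[j] = BOUNDARY
    return labels


def mark_intervals(labels):
    """Number the runs of non-boundary pixels 1, 2, ... along a line.

    Returns the relabeled line and the number of intervals found.
    """
    out, count, in_run = [], 0, False
    for v in labels:
        if v == BOUNDARY:
            out.append(BOUNDARY)
            in_run = False
        else:
            if not in_run:
                count += 1
                in_run = True
            out.append(count)
    return out, count
```

The interval numbers are only local to a line; connecting them from one line to the next is the job the text assigns to CONNCT.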


Program Variables

CHKIO            SYSTEM SUBROUTINE
CONNCT           SUBROUTINE     Transfers labels from one line to the next,
                                and begins new labels when no connection
                                exists.
DATBUF(NC,ND,3)  INTEGER ARRAY  Three lines of data, organized as a
                                circular buffer.
FILLLR           SUBROUTINE     Replaces a boundary-not boundary-boundary
                                gap by three boundary points along a line.
FINISHED         LOGICAL        Returned .TRUE. by CONNCT when no more
                                labels exist.
I,ITM,J,K        INTEGER        DO loop index.
I1,I2,I3         INTEGER        Circular buffer pointers.
IMGCLAS          INTEGER        Image number of the (temporary)
                                labels-boundary disk maps.
IMGIN            INTEGER        Image number of the data.
IN               INTEGER        The number of intervals found by MRKIVL in
                                a line.
IND(1)           INTEGER ARRAY  Error indicator.
INTTHR(ND)       INTEGER ARRAY  The vector thresholds obtained by THRFND.
IROW             INTEGER        Line number being read.
IROWOUT          INTEGER        Line of labels being written.
ISCAN(1)         INTEGER ARRAY  Scratch array used by CONNCT to store labels.
IT               INTEGER        Used to rotate pointers.
LAB              INTEGER        Current label number.
LABBUF(NC,3)     INTEGER ARRAY  Three lines of labels.
MARKLR           SUBROUTINE     Mark left-right boundaries depending on
                                vector thresholds.
MARKUPDN         SUBROUTINE     Mark up-down boundaries.
MASK             LOGICAL        The mask flag.
MRKIVL           SUBROUTINE     Mark intervals along a line in the
                                complement of the boundary.
NC               INTEGER        Number of columns in a strip.
ND               INTEGER        Dimensionality.
NW,NX,NY         INTEGER        Used to partition the image into strips
                                (see Wide Image Logic).
NZ               INTEGER        Actual number of samples.
READP            SYSTEM SUBROUTINE
UICB(1)          INTEGER ARRAY  User Information Control Block.
WRITEP           SYSTEM SUBROUTINE
Z                INTEGER        -32768; boundary marker.
Parent: MAIN                                                  THINTSTM

Calls: GETN25

THINTSTM(MP,TSP,TTP,CLASS,COUNT,N25,N60,N288,N140,N388,N428,ND,NTSI,FILENO,
         UICB,IND)

Subroutine ASELECT creates a disk file of samples taken 5 at a time from
the patches (the components of the complement of the boundary). There will
be many of these sets in a large image. The starting conditions for the
clustering part of AMOEBA require relatively few. Further, the starting
cluster centers have not been formed yet. Subroutine THINTSTM forms the
starting cluster centers and "thins" the test sets.

Method: In MAIN, program variables N288, N388, and N428 are selected as
large as possible depending on available memory (i.e. depending on ND).
(If ND is 4, then N288 = 288, N388 = 388, and N428 = 428.) If we start
with NTS test sets (stored on disk by ASELECT), and if NTS < N428, then
THINTSTM simply forms NTS means as starting cluster centers and returns
N140 = NTS and N428 = NTS. Otherwise there are many test sets, and a
complex procedure is followed to prevent the number of means and the
number of test sets from growing too large.

     1. If NTS < N25, go to step 7 (the finish). Otherwise read 25 test
        sets into the temporary buffer TTP.

     2. Each test set has 5 test points; classify the 25 first elements
        into the class of the nearest last element.

     3. For each of the test sets classified correctly (i.e. in which the
        first element was nearer its own last element than the last
        element of another of the 25), form the mean and add this mean to
        the mean pool MP, indicate that the mean pool slot is occupied
        with COUNT, and add the test set to the test set pool TSP,
        indicating that it is occupied with CLASS. Count these events in
        NMP (the means) and NTSP (the test sets).

     4. If there are more than 100 vectors in the mean pool, proceed to
        step 5. If there are more than N388 test sets, proceed to step 6.
        Otherwise go to step 1 and get more.

     5. More than 100 vectors are in the mean pool. Classify all center
        test pixels into the classes of the mean pool and count the number
        of times a mean is hit. Eliminate each mean which was not hit. If
        NMP is greater than N60 (= 60) after this operation, eliminate
        each mean with 1, 2, ... classifications until NMP ≤ N60. At each
        elimination, reclassify center test pixels which were assigned to
        an eliminated mean and test whether NMP ≤ N60. When N60 is
        reached, test whether NTSP is greater than N388. If NTSP ≤ N388,
        return to step 1 to gather more, else proceed with step 6.

     6. More than N388 test sets are in the test set pool, but fewer than
        100 means are in the mean pool. Moreover, the majority of the test
        sets have been assigned to a class by the logic of step 5. (The
        ones just added haven't been, of course, so they are kept around a
        while longer.) Determine the mean which has the most test sets
        assigned. Test sets assigned to this mean are duplicates. Eliminate
        one (the first one found) and decrease the count. Repeat until
        fewer than N288 test sets are present, then go to step 1.

     7. Close the disk file which contained the test sets, and compress
        and count the means and test sets; N140 is returned as the number
        of means (starting cluster centers) and N428 as the number of test
        sets.
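
Steps 2 and 3 above can be sketched for one batch of test sets as follows. This is a Python illustration with hypothetical names (dist2, keep_consistent_sets); it assumes integer means (the pools are INTEGER arrays) and treats a tie with an earlier set as a misclassification, neither of which is stated in the documentation.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def keep_consistent_sets(batch):
    """Steps 2 and 3 for one batch of test sets.

    Each test set is a list of pixel vectors. A set is kept when its
    first element is nearer to its own last element than to the last
    element of any other set in the batch.
    Returns (kept_sets, means).
    """
    lasts = [ts[-1] for ts in batch]
    kept, means = [], []
    for i, ts in enumerate(batch):
        nearest = min(range(len(batch)),
                      key=lambda j: dist2(ts[0], lasts[j]))
        if nearest == i:
            kept.append(ts)
            n = len(ts)
            # Component-wise integer mean of the pixels in the set.
            means.append([sum(px[c] for px in ts) // n
                          for c in range(len(ts[0]))])
    return kept, means
```

A set whose first pixel is closer to another set's last pixel is dropped, so only sets that are internally coherent relative to the batch contribute a starting mean.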


Program Variables

CLASS(N428)     INTEGER ARRAY  The class to which a center test pixel
                               is assigned.
COUNT(N140)     INTEGER ARRAY  The number of center test pixels assigned
                               to a mean.
FCLOSE          SYSTEM SUBROUTINE
FILENO          INTEGER        File number of scratch disk file containing
                               test sets.
FIRST           LOGICAL        Switch used to jump around rewind of file in
                               GETN25.
GETN25          SUBROUTINE     Reads N25 test sets from disk.
[illegible]     INTEGER        DO loop index.
IA              INTEGER        Class of nearest last test set element to
                               first in batch of N25.
ICL             INTEGER        Class of nearest mean to test set center.
IE              INTEGER        Count of how unpopular a mean should be to
                               be eliminated.
IM              INTEGER        Distance to mean of a test set center.
IND(1)          INTEGER ARRAY  Error indicator.
IS              INTEGER        Distance accumulator.
IT              INTEGER        Used to convert logical to integer. Also
                               distance temporary.
m               INTEGER        Distance temporary.
JS              INTEGER        The test set center reclassified after a
                               mean is eliminated.
OT              INTEGER        A classification, tested to see if a
                               duplicate test set has been found.
LIT             LOGICAL        Equivalenced to IT; used to convert logical
                               to integer to compensate for the inadequacy
                               of the Segmenter.
MAX             INTEGER        The running maximum count of classification
                               in all means.
MP(ND,N140)     INTEGER ARRAY  The mean pool and, on return, the means.
N140            INTEGER        On call, 140; on return, the number of
                               starting clusters.
N25             INTEGER        The number of test sets read at a time.
N288            INTEGER        Number of test sets sought after the
                               iteration is started.
N388            INTEGER        Test set iteration trigger.
N428            INTEGER        Number of slots for test sets.
N60             INTEGER        Means sought after iteration started.
ND              INTEGER        Dimensionality.
ND5             INTEGER        ND*5, for GETN25.
NDXM            INTEGER        Pointer to mean pool, used in search for a
                               free slot.
NDXP            INTEGER        Pointer to test set pool.
NEAR            INTEGER        Used in test set to test set smallest
                               distance determination.
NMP             INTEGER        Number in mean pool.
NTS             INTEGER        Number of test sets on disk remaining.
NTSI            INTEGER        Number of test sets on disk.
NTSP            INTEGER        Number in test set pool.
TSP(ND,5,N428)  INTEGER ARRAY  Test set pool.
TTP(ND,5,N25)   LOGICAL ARRAY  Test sets for GETN25 to read into.
UICB(1)         INTEGER ARRAY  User Information Control Block.
Parent: MAIN                                                    THRFND

THRFND(NFL,INTTHR,SCANLINE,UICB,IND,KOUNT,NR,NC,ND,MASK,IMGIN)

Subroutine THRFND finds vector thresholds INTTHR so that about NFL percent
of the scene is in patches. The thresholds are used in subroutine START
to decide boundary; a boundary decision is made when any channel differs
from its neighbor by more than that channel's INTTHR.


Method: Initially the data is sparsely sampled to estimate the
variability. The initial thresholds are, for each channel K,
.15 x (NFL+5) x A(K)/[illegible], where A(K) is the average estimated in
that channel of the difference of a point from its left-hand neighbor.

     These initial thresholds are updated in an adaptive program which
scans the data checking when a boundary decision would be made. A count
is kept of the non-boundary (i.e. pure) decisions. When too few pure
points are being found, the thresholds are increased (making it harder
to be a boundary point and thus easier to be a pure point). When too
many are found, the thresholds are decreased. Too many or too few is
decided by comparing the count NFND with the target TARGET.

     Only one threshold is increased or decreased at a time. A count
is kept of the number of boundary decisions which have been made per
channel. When it is necessary to decrease the thresholds, the threshold
which has the highest count is decreased. Dually, when the thresholds
must be increased, the threshold with the lowest count is incremented.
The effect is that the algorithm expects all channels to contribute
equally in boundary-finding.
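
One adaptive update of the kind just described might look like the sketch below. This is a Python illustration with hypothetical names; the unit step size and the use of exact equality for "about right" are assumptions, since the documentation gives neither the increment nor the tolerance.

```python
def adjust_thresholds(thresholds, kount, nfnd, target, step=1):
    """One adaptive update: change a single channel threshold.

    thresholds -- per-channel integer thresholds (modified in place)
    kount      -- per-channel counts of boundary decisions
    nfnd       -- number of pure points found so far
    target     -- number of pure points wanted so far
    Returns "up", "down", or "same".
    """
    if nfnd < target:
        # Too few pure points: raise the threshold of the channel
        # that has made the fewest boundary decisions.
        k = kount.index(min(kount))
        thresholds[k] += step
        return "up"
    if nfnd > target:
        # Too many pure points: lower the threshold of the channel
        # that has made the most boundary decisions.
        k = kount.index(max(kount))
        thresholds[k] -= step
        return "down"
    return "same"
```

Adjusting the least active channel upward and the most active channel downward is what pushes the per-channel boundary counts toward equality.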

     Flags UP and DOWN, initially .FALSE., control exit. When the
thresholds are increased, UP is set .TRUE. When they are decreased,
DOWN is set .TRUE. Another flag, BOTH (initially false), is tested
after each outer loop (the data is scanned in a pattern which spreads
the sparse sample) and then, after the test, is set to UP.AND.DOWN.
Thus one more scan of the data is taken after both flags have been set.
     The scanning strategy has two phases: the initial phase sets NCS =
the larger of 4 or NR*NC/557039. Then data is sampled by every
eleventh line and every NCS-th sample along a line to estimate variability.
When the sum exceeds 32245 in any channel or when the loop falls through,
the initial INTTHR estimates are calculated and the second sampling phase
is entered. Let

     NSEL = (NR/100)*(NC/100)/2 + 1
     N20 = 20 ; if N20 > NC, N20 = NC/2 .

Then the loop structure is the following (in FORTRAN):

          DO 665 NN = 2, N20, NSEL
             DO 65 N = 1, 20, 5
                IS = N+NN
                IF (IS.GE.NR) IS = NR/2
                DO 100 I = IS, NR, 17

                   read data for line I
                   DO 60 J = NN, NC, 5

                      count pure decisions and boundary
                      decisions per channel.

      60           CONTINUE

                   see if too many (->80), just right (->100)
                   or too few (here) (increase and go to 63)

      80           too many : decrease

      63           reset counters and target

     100        CONTINUE

      65     CONTINUE

             test BOTH ; if .TRUE., RETURN
             set BOTH to UP.AND.DOWN
     665  CONTINUE
          RETURN

As can be seen, the sample is not sparse: in particular, the 65 loop
must be executed at least twice, first with NN = 2 and then NN = 2+NSEL
(in a 512x512 image, NSEL = 13). Thus, with NSEL = 13, the rows sampled
start at 3, 8, 13 and 18 for NN = 2 and at 16, 21, 26 and 31 for NN = 15,
each continuing in steps of 17. During the first four, every fifth sample
pair along a line is sampled beginning at sample 2. During the second
four, the starting sample is 15. This sampling strategy avoids staying in
any place too long and is fast.
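
The second-phase sampling pattern can be enumerated directly from the loop bounds given above. The Python sketch below (function name sampling_plan is hypothetical) lists the starting row and starting sample of each pass; it ignores the early RETURN controlled by BOTH, so it shows the full pattern the loops would visit.

```python
def sampling_plan(nr, nc):
    """Starting row and starting sample for each pass of the second
    sampling phase, following the loop bounds of THRFND.

    Returns (nsel, plan) where plan is a list of (start_row, start_sample);
    rows then advance in steps of 17 and samples in steps of 5.
    """
    nsel = (nr // 100) * (nc // 100) // 2 + 1
    n20 = 20
    if n20 > nc:
        n20 = nc // 2
    plan = []
    for nn in range(2, n20 + 1, nsel):      # DO 665 NN = 2, N20, NSEL
        for n in range(1, 21, 5):           # DO 65 N = 1, 20, 5
            start_row = n + nn
            if start_row >= nr:
                start_row = nr // 2
            plan.append((start_row, nn))
    return nsel, plan
```

For a 512 by 512 image this gives NSEL = 13 and eight passes, four starting at sample 2 and four starting at sample 15.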


Program Variables

BOTH             LOGICAL        A flag which, when true, means the
                                thresholds have been increased and
                                decreased.
CHKIO            SYSTEM SUBROUTINE
DELTGT           REAL           (NFL+5)/100, used to increment TARGET when
                                a test is made.
DOWN             LOGICAL        A flag used to tell when the thresholds
                                have been decreased.
FLAG             LOGICAL        Used to tell when a pure point has been
                                detected.
I,J,K,N,NN       INTEGER        DO loop index.
IMGIN            INTEGER        Input image number.
IND(1)           INTEGER ARRAY  Error indicator.
INTTHR(ND)       INTEGER ARRAY  The integer thresholds.
IS               INTEGER        Starting row in loop 100.
IT               INTEGER        Used to accumulate initial estimate of
                                variability.
JM               INTEGER        J-1; points to sample to left along a line.
KI               INTEGER        Index of threshold to be increased or
                                decreased.
KOUNT(ND)        INTEGER ARRAY  Used to count the number of times a
                                boundary decision would have been made in
                                a channel if all were tested.
MASK             LOGICAL        If .TRUE., a value of 0 in channel 1 is
                                regarded as a mask (i.e. not image data).
MAX,MIN          INTEGER        Used to find max or min of KOUNT to
                                determine which channel threshold to
                                adjust.
N20              INTEGER        Loop 665 parameter: usually 20.
NC               INTEGER        Number of samples.
NCS              INTEGER        MAX0(NR*NC/557039,4); used to sparsely
                                sample on the initial estimate.
ND               INTEGER        Dimensionality.
NFL              INTEGER        Input parameter: percent of scene which
                                user believes to be inside patches.
NFLD             INTEGER        NFL+5; internal parameter which allows for
                                "crack" fill-in logic in START which adds
                                boundary points not based on thresholding.
                                This has been found to amount to about
                                5 percent.
NFND             INTEGER        Running number of pure points found; tested
                                against TARGET to decide if too many, too
                                few, or about right during a pass through
                                the data.
NR               INTEGER        Number of lines.
NSEL             INTEGER        (NR/100)*(NC/100)/2+1; used as loop 665
                                parameter.
NUM              INTEGER        Counter during variability estimation phase.
OONUM            REAL           Used to form initial threshold estimates.
READP            SYSTEM SUBROUTINE
SCANLINE(NC,ND)  INTEGER ARRAY  One line of data.
TARGET           REAL           Running count of the target percent pure
                                points.
UICB(1)          INTEGER ARRAY  User Information Control Block.
UP               LOGICAL        A flag which is set when the thresholds
                                have been increased.
Parent: NUMCLU                                                   UNCLE

UNCLE(TSPXL,MEAN,ND,MAX,NEAR,SUM,ID)

This program finds the cluster from MEAN nearest TSPXL. Only clusters
with index I such that SUM(I) ≠ 0 are considered. The distance is
returned as ID, the index as NEAR.


Method: Self-documenting.


Program Variables

ID            INTEGER        The squared distance (biased by -32768) from
                             TSPXL to the nearest cluster in MEAN.
IS,IT         INTEGER        Used to compute the distance.
K,M           INTEGER        DO loop index.
MAX           INTEGER        The largest possible number of means. Inactive
                             means have SUM(.) = 0.
MEAN(ND,MAX)  INTEGER ARRAY  The cluster centers.
ND            INTEGER        Dimensionality.
NEAR          INTEGER        Returned index of nearest cluster to TSPXL.
SUM(MAX)      INTEGER ARRAY  An indicator that a cluster has been
                             eliminated. If SUM(I) = 0, then cluster
                             MEAN(.,I) is gone.
TSPXL(ND)     INTEGER ARRAY  The point to be classified.
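
Since the method is described only as self-documenting, a short sketch may help. The Python below is an illustration with hypothetical names (nearest_active, active_sum); it uses 0-based indices and returns None for the index when every cluster is inactive, a case the documentation does not discuss.

```python
BIAS = 32768


def nearest_active(tspxl, mean, active_sum):
    """Return (near, biased_d2) for the nearest cluster whose SUM
    entry is nonzero.

    mean       -- list of cluster-center vectors
    active_sum -- per-cluster values; zero marks an eliminated cluster
    """
    near, best = None, None
    for i, center in enumerate(mean):
        if active_sum[i] == 0:
            continue  # eliminated clusters are skipped
        d2 = sum((a - b) ** 2 for a, b in zip(tspxl, center))
        if best is None or d2 < best:
            near, best = i, d2
    # The squared distance is reported with the -32768 bias applied.
    return near, (None if best is None else best - BIAS)
```

Skipping entries with a zero SUM lets callers retire a cluster without compacting the MEAN array.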
APPENDIX A

THE THEORETICAL FOUNDATION OF AMOEBA

     The Assumptions. The clustering technique AMOEBA is based on three
groups of theoretical statements. The first group concerns the
relationship between the spatial and spectral behavior of the data. Roughly
speaking, it is assumed that spectrally homogeneous groups of pixels
found in spatially connected blobs represent the real classes. The
second group contains a definition of a new concept (the pair probability
of misclustering) and specifies that in clustering it is desired to
minimize this probability. The third group concerns the problem of handling
the classification of pixels which are on the spatial boundary between
two real classes.

Group A

A1   Real classes exist and can be distinguished using digital
     multi-imagery.

Discussion: While it might not be questionable that real classes exist,
it is certainly not clear that this is the case for their representation
in multi-image measurements. Assumption A1 may fail in clustering if too
much is hoped for in the identification of clusters in the data with
real world classes. On the other hand, the assumption must certainly
be held, at least implicitly, by all who would cluster the data looking
for associations homologous to real classes.

     It can also be observed that digital multi-image data consists of
pixels at the atomic level. A consequence of assumption A1 is that,
at least for some pixels, it is meaningful to ask what real class a
pixel belongs to. It is clearly not possible to ask this of all pixels.
Pixels on spatial boundaries display erratic statistical fluctuations which
are generally difficult to model. This is particularly true of data
which is one or more of

     (a) data sampled in a particular scan line direction and
         subjected to significant band-width limited processing
         after sampling;

     (b) pictorial imagery with sampling at a density comparable
         to the point-spread of the lens system;

     (c) multi-temporal imagery in which imperfect registration
         has been performed: that is, multi-imagery in which
         spatial adjustments have been made so that pixels from
         various single images are samples from approximately
         the same spatial point.

(It can be noted that Landsat multi-temporal multi-spectral data enjoys
all of these properties.) For digital multi-imagery, we can distinguish,
at least in principle, between mixture pixels and pure pixels (the rest).
Mixture pixels arise as a consequence of finite bandwidth ((a) or (b)) or
imperfect registration ((c)). If the data is sampled efficiently and real
classes are found in small groups then many pixels will be mixtures.

     On the other hand, if it is to be believed that an adequate spatial
sample is available, then each real class must be represented in at least
some pure pixel associations. In order to make this more precise, we
introduce some terminology.

     Let I denote the digital image. The next three assumptions concern
the existence of a set P ⊂ I of pixels such that neighboring pixels
in the set are unusually like one another in measurement space. Call
two pixels with spatial coordinates (i,j) and (n,m) neighbors if
|i-n| + |j-m| = 1. (A pixel inside the image has four neighbors.)
A path is an ordered sequence p_1,...,p_n such that p_(k-1) is a neighbor
of p_k for k = 2,...,n. A set Q is said to be connected if for
each pair p, q in Q there is a path p_1,...,p_n in Q with p_1 = p
and p_n = q. It is easy to see that any non-void P ⊂ I is a union of
non-void maximal connected sets Q_i, i = 1,...,k, with Q_i ∩ Q_j = ∅
for i ≠ j; the components are uniquely determined.

We assume that:

A2   A subset P of I has the property that each pixel p ∈ P
     is a pure measurement from a real class.

Call the components of P patches; consequences of the purity assumption
and the discussion above on sampling are the following two statements.
Rather than formalize the sampling discussion, we simply assume:

A3   All pixels from a given patch are measurements from the
     same real class.

A4   Each real class has at least one measurement pixel in P.
Group B

     Consider a clustering C = {C_0, C_1, ...} of the data. (It is
convenient to include a cluster C_0 which one might call "unknown"; most
classification rules allow a threshold in which a pixel is not assigned
to any cluster.) In what follows, we call a clustering of a pixel p
in C_i meaningful if and only if i > 0. Consider also the unknown
real partition of data into pure real classes {R_1,...,R_k} plus a
mixture class R_0. These are not simply unknown: they depend on the
observer, and so are unknowable. In clustering, one might hope to
minimize the "probability of misclassification." Unfortunately, since
the clusters are not labelled and, indeed, no labels independent of an
external observer exist, this concept is meaningless.

     One observation we can make right away is the following: it is
clearly an error if p ∈ R_i for i ≠ 0 and p ∈ C_0. This is actually
a restriction on the "rejection thresholds", and is used in AMOEBA to
determine when the clustering is going astray. Here we simply assume
R_i ∩ C_0 = ∅ when i ≠ 0.

     Consider a pair {p,q} of pure pixels. Let r(s) denote the
real class a pixel s is in and c(s) the cluster. Since c(p) ≠ 0
and c(q) ≠ 0, p and q are clustered in meaningful clusters, and
there are four cases:

     (i)    r(p) = r(q)  and  c(p) = c(q) ;
     (ii)   r(p) ≠ r(q)  and  c(p) ≠ c(q) ;
     (iii)  r(p) = r(q)  and  c(p) ≠ c(q) ;
     (iv)   r(p) ≠ r(q)  and  c(p) = c(q) .

The last two cases are errors.
B1   The probability that two pure pixels are in the
     same real class and are clustered differently plus the
     probability that two pure pixels are in distinct real classes
     and are clustered alike is called the pair probability of
     misclustering (PPMC).

B2   Objective: In clustering pure pixels, it is desired that

     (a) each pure pixel be assigned to a meaningful cluster, and

     (b) the PPMC be minimal.
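
To make definition B1 concrete, the sketch below counts error cases (iii) and (iv) over all pixel pairs. It is a Python illustration only: the function name is hypothetical, the real classes are supplied as if they were known (the text stresses that in practice they are not), and an empirical pair frequency stands in for the probability.

```python
from itertools import combinations


def pair_misclustering_rate(real, cluster):
    """Fraction of pixel pairs that fall in error case (iii) or (iv).

    real[s] and cluster[s] are the real class and the cluster of
    pure pixel s.
    """
    errors = total = 0
    for p, q in combinations(range(len(real)), 2):
        same_real = real[p] == real[q]
        same_cluster = cluster[p] == cluster[q]
        # A pair is an error exactly when the two relations disagree.
        errors += same_real != same_cluster
        total += 1
    return errors / total if total else 0.0
```

Note that the measure needs no correspondence between cluster numbers and class names, which is why it remains meaningful when the clusters are unlabelled.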

     Before examining just how these assumptions are developed into a
clustering program, we consider the problem of handling mixture pixels.

Group C

     The underlying classification rule used in AMOEBA is nearest
neighbor (Euclidean distance) to cluster center. A clustering program
based on the model discussed above furnishes cluster centers. The pixels
are tentatively classified by nearest cluster center and this
classification is checked spatially. We assume:

C.1  a. The nearest cluster center classification is generally
        accurate.

     b. Each pixel with two, three or four spatial neighbors in the
        same class is acceptably classified.

     c. Each pixel with no neighbor in the same class is not
        correctly classified.

Assumption C.1 allows us to locate and mark pixels with one or no neighbor
in the same class for examination. Most of these pixels are boundary pixels.
To model this situation, we assume:

C.2  Each pixel on a spatial boundary is, as a measurement vector, a
     convex combination of the cluster centers in which two of its
     four neighbors are classified.

     Although this model ignores both contaminated boundaries and
registration errors, it leads to a method for reclassifying apparent errors
and, unexpectedly, to a cluster-dependent rejection threshold. Note that,
if b = αp + (1 - α)q is a convex combination of vectors p and q,
then dist(b, nearer of p,q) ≤ ½ dist(p,q). Suppose p is a pixel which
was marked as having no neighbor in the same class. Let q_1, q_2, q_3 and
q_4 be the cluster centers of the classes of the four neighbors of p
which are acceptably classified. (Usually at most two are distinct, and
often only one neighbor is in a valid cluster at this point.) For each
cluster i, let r(i) = ½ max_j dist(c_i, c_j) denote half the distance from
cluster center c_i to the other center furthest away. Reclassify pixel p
in class q_i provided dist(p,q_i) < r(i) and

     dist(p,q_i) < min {d : d = dist(p,q_j), j ≠ i, and d < r(j)}.
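
The reclassification rule can be read as "choose the nearest neighbor class whose center lies within that class's own radius r". The Python sketch below follows that reading; the function name is hypothetical, ties are broken by lowest class index (the strict inequality in the text leaves ties unspecified), and at least two cluster centers are assumed.

```python
import math


def reclassify(p, neighbor_classes, centers):
    """Pick a class for pixel p from the classes of its acceptably
    classified neighbors.

    neighbor_classes -- class indices of the accepted neighbors
    centers          -- all cluster centers, indexed by class
    Returns the chosen class index, or None if no candidate passes.
    """
    def r(i):
        # Half the distance from center i to the farthest other center.
        return 0.5 * max(math.dist(centers[i], c)
                         for j, c in enumerate(centers) if j != i)

    best, best_d = None, None
    for i in sorted(set(neighbor_classes)):
        d = math.dist(p, centers[i])
        # Candidate must pass its own threshold and beat earlier ones.
        if d < r(i) and (best_d is None or d < best_d):
            best, best_d = i, d
    return best
```

A pixel that fails every candidate's threshold stays unclassified, which corresponds to the "unknown" cluster C_0 of Group B.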

     The IDIMS function AMOEBA represents one attempt to follow this
model as far as it can take us. We make only two concessions to reality.
First, the boundary estimation program is good but not perfect, so we
do not actually classify patches as one unit. Second, registration errors
upset the mixture model (that is, in registration-error pixels, the α
depends on the band, but, even so, the distance to the closest center is
no more than √(1/2) dist(p,q) ≈ .7 dist(p,q)).
APPENDIX B

SYSTEM SUBROUTINES

 *   ATHWDS   Authorize work data sets (temporary images)
 *   CHKIO    Check for errors after an I/O is performed
 *   CLOSEP   Close a picture file
 *   DELMDS   Deauthorize work data sets
 **  FCHECK   Check for I/O errors
 **  FCLOSE   Close a file
 **  FOPEN    Open a file
 **  FWRITE   Write a record
 *   OPENPI   Open the input image disk file
 *   OPENPO   Open the output image disk file
 *   PARAMS   Prompt for user parameters
 *   PRINTP   Print a message
 *   READP    Read a portion of an image
 *   WRITEP   Write a portion of an image


 *   Supplied by ESL. Reference "IDIMS Applications Programmers Guide"
     ESL-TM1047.

 **  Supplied by Hewlett-Packard. Reference "MPE Intrinsics Reference
     Manual" Part No. 3000-90010.
APPENDIX C

IDIMS USER DOCUMENTATION

AMOEBA

A. PURPOSE 


Performs completely unsupervised clustering and classification of
a multispectral image using a spatial-spectral clustering algorithm.

B. INPUT AND OUTPUT 


The input image must contain between 2 and 16 bands, and must
contain no measurement less than zero or greater than 127. The output
image is of type BYTE.

C. PARAMETERS 


There are 8 optional parameters. They are:

STATFILE  = Name (alphanumeric characters) of the statistics file in
            which output statistics data is to be stored.

PCTFLDS   = The user's estimate of the percent of the image contained
            in "fields" (spatially connected, spectrally homogeneous
            areas). (integer)
            Default = 45

CHAN1MAP* = Shall a map of band one of the image be sent to the user?
            (character)
            Default = 'N'

LABELMAP* = Shall a map of labels be sent to the user? (character)
            Default = 'N'

CLASSMAP* = Shall a classification map be sent to the user? (character)
            Default = 'N'

MASK      = Shall a value of 0 in band 1 be taken as a mask (i.e., not
            part of the image)? (character)
            Default = 'Y'

MINCLUS   = User's desired minimum number of clusters. If negative,
            exactly -MINCLUS clusters will be sought. If positive, at
            least MINCLUS clusters will be sought. (integer)
            Default = 10

MAXCLN    = User's desired maximum number of clusters. May not exceed
            98.
            Default = 98

D. EXAMPLE 

INIMAGE > AMOEBA > OUTCLUST

*These are primarily debugging aids to give the user a quick look at the
data and follow the progress of the function. Additional parameters PRINTSL
(starting line), PRINTNL (number of lines), and PRINTSS (starting sample)
are required.

After prompting for parameters, messages to the user will appear
as follows (assume the image is NB bands and all default options
are taken):

INTTHR = (list of boundary detection thresholds)

#LABELS = (number of distinct "fields" found; a field is defined
as a connected area of non-boundary)

#TSTSTS = (number of "test sets" found; a test set is a set of 5
pixels collected from the same field)

Minimum number of clusters sought: 10
Start with nn clusters, nm test points. kk clusters have non void assignments.

Square of diameters of starting clusters: sss

Number of clusters: cc

Estimate of Pair PMC: pp percent.

Final number of clusters = ff
(number in 1) ("center" of 1)
...
(number in ff) ("center" of ff)

There are uu unclassified.

The mask contains aa points.

End function -- AMOEBA

The meaning of most of these outputs is explained in the algorithm
documentation (Section F). The principal user output is the list of cluster
"centers" (attractors is probably a better term) and the number of image
elements assigned to each center.

E. DIAGNOSTIC MESSAGES 


There are five messages AMOEBA may return: 

1. Your image contains a value over 127. Please use MAP to put into
the range 0-127. FUNCTION DOES NOT SUPPORT INPUT DATA TYPE

A value over 127 was encountered.

2. NUMBER OF BANDS SPECIFIED NOT ALLOWED BY FUNCTION

Number of bands must be at least 2 and at most 16.

3. EXTERNAL FILE COULD NOT BE ACCESSED

The STATFILE name is already in use.

4. SPECIFIED NON-IMAGE FILE PREFIX INVALID

The STATFILE name contains more than 8 characters, or is otherwise
invalid.

5. INPUT IMAGE SIZE EXCEEDS FUNCTION CAPABILITIES

The input image is too wide for even one buffer on input and one
on output. Use MOSAIC to segment the image into strips. (The
number of bands and number of lines do not matter, only the number
of samples.)

Additional messages may be returned if I/O problems are encountered 
during operation of the function. 

F. ALGORITHM 


The clustering and classification function AMOEBA is based on a
simple model for image data. In the model, the concepts of boundary, field,
and classification are defined, and assumptions are made about the accuracy
of a clustering in terms of the classification. The classification is
based on a spatially modified nearest neighbor classifier (Euclidean distance).
To train such a classifier, one needs to know only the number of classes
and the class "centers". The actual function proceeds in steps as
follows; subroutine names are given in parentheses:

Find boundary detection vector thresholds (THRFND):

Boundary detection is based on vector gradient thresholds (rather
than a norm threshold or other one-dimensional decision rule). Thus one
threshold is determined for each band. Initially the image is sparsely
sampled to estimate the thresholds, with each band contributing about the
same number of boundary decision estimates. These thresholds are passed
to the next step.

Find connected sets of non-boundary (START):

The data is scanned three lines at a time (in a circular buffer),
and a circular buffer of labels is created. A boundary decision results
when a point and its neighbor differ by more than the threshold in any
band. "Cracks" are filled in, and intervals are located on each new
labels line. These are then connected to the previous line of labels, and
the previous line is written to disk. The resulting intermediate image
contains -32768 (the smallest 16-bit two's complement integer) marking
boundary, or n (which starts at -32767 and is incremented) labeling
connected sets of non-boundary. #LABELS = n is printed.
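
The boundary decision and labeling step can be illustrated with a small in-memory sketch. It is not the START subroutine: it holds the whole image in memory instead of a three-line circular buffer, it skips "crack" filling, and it assumes (hypothetically) that a pixel is compared with its right and lower neighbors and is itself marked as boundary when any band differs by more than that band's threshold.

```python
from collections import deque

BOUNDARY = -32768  # smallest 16-bit two's complement integer


def is_boundary(image, r, c, thresholds):
    """True if pixel (r, c) differs from its right or lower neighbor
    by more than the threshold in any band."""
    rows, cols = len(image), len(image[0])
    for dr, dc in ((0, 1), (1, 0)):
        rr, cc = r + dr, c + dc
        if rr < rows and cc < cols:
            pairs = zip(image[r][c], image[rr][cc], thresholds)
            if any(abs(a - b) > t for a, b, t in pairs):
                return True
    return False


def label_fields(image, thresholds):
    """Label 4-connected sets of non-boundary pixels, starting at -32767."""
    rows, cols = len(image), len(image[0])
    labels = [
        [BOUNDARY if is_boundary(image, r, c, thresholds) else None
         for c in range(cols)]
        for r in range(rows)
    ]
    next_label = BOUNDARY + 1
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] is not None:
                continue
            # Flood fill one connected field with the current label.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] is None):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label - (BOUNDARY + 1)


row = [(10, 10), (10, 10), (10, 10), (50, 50), (50, 50)]
image = [list(row) for _ in range(3)]
labels, n_fields = label_fields(image, thresholds=(5, 5))
```

In this toy image the third column is marked as boundary and the two sides receive the labels -32767 and -32766.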

Extract test sets and store on disk scratch file (ASELECT)

The labels map and data are scanned, and as large a buffer as there
is memory for is allocated to accumulate samples bearing the same label.
When the buffer fills (or at end of the data), each same-label batch
is sampled, taking every fifth point (from batches with at least 5), and
the sample sets are stored on disk. There can be as many as 64K-1 such sets.
These are passed to the next step. Their number is printed (#TSTSTS). The
temporary labels map is deleted from disk.

Thin test sets and accumulate starting clusters (THINTSTM)

The starting clusters are to be means of samples taken from the same
component of the complement of the boundary. Accordingly, they should
be spectrally purer, since this tends to minimize registration errors
and reduce noise. However, it is out of the question to classify 10,000
things in 2,000 classes, and there may be even more than this many test
sets (i.e., more than 2,000) in a large image. We therefore reduce the
number by (a) removing apparent duplicate means, (b) removing test sets
in which the first sample is relatively unlike the last, and (c) removing
apparent duplicate test sets. The final number of test sets is printed
in the next step. The temporary test set file is deleted from disk.

Find the clusters and their number (NUMCLU)

First sort the test sets in increasing order based on the sum of
the odd channel values. Samples from the same test set are in the same
"field", and therefore are from the same real class (on the model assumptions).
Samples spread out in this order tend to be from different real classes.

Errors are made by the classifier when a center attracts points from
different real classes or participates in the splitting of a pair from
the same test set. Therefore, centers which make errors are eliminated.

A running estimate is kept of the probability that a pair from different
classes is clustered alike, plus the probability that a pair from the
same real class is clustered differently. The minimum value of this
estimate of the Pair PMC is used to determine the number of clusters and
exactly what they are.
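
A minimal sketch of the Pair PMC estimate follows. It assumes the pairs have already been formed (same-field pairs from within a test set, different-class pairs from test sets far apart in the sorted order); how NUMCLU actually forms and updates them is not shown here, and the function names are hypothetical.

```python
def nearest(x, centers):
    """Index of the center nearest to x (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])),
    )


def pair_pmc(same_pairs, diff_pairs, centers):
    """Percent estimate: P(different-class pair clustered alike)
    plus P(same-class pair clustered differently)."""
    split = sum(
        nearest(a, centers) != nearest(b, centers) for a, b in same_pairs
    ) / len(same_pairs)
    merged = sum(
        nearest(a, centers) == nearest(b, centers) for a, b in diff_pairs
    ) / len(diff_pairs)
    return 100.0 * (merged + split)


centers = [(0.0, 0.0), (10.0, 10.0)]
same_pairs = [
    ((0.0, 1.0), (1.0, 0.0)),
    ((9.0, 9.0), (10.0, 11.0)),
    ((0.0, 0.0), (1.0, 1.0)),
    ((4.0, 4.0), (6.0, 6.0)),   # split between the two centers
]
diff_pairs = [
    ((0.0, 0.0), (10.0, 10.0)),
    ((1.0, 1.0), (2.0, 2.0)),   # merged into one cluster
]
pmc = pair_pmc(same_pairs, diff_pairs, centers)
```

Evaluating such an estimate for each candidate set of centers and keeping the set with the smallest value mirrors the selection described above.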

Classify and count (CLASSIFY)

A spatially modified nearest neighbor classification is now performed.
Initially, each point is classified by nearest neighbor. Then this
classification is checked for accuracy by looking at the classification of
the four nearest neighbors. Points with one neighbor in the same class
are deemed OK. Not-OK points are examined with the view of reclassification
in the class of OK neighbors, provided this can be done consistently with
the anticipated spectral appearance of a mixture pixel. No reclassification
based on spatial content alone is performed. That is, all reclassification
must fit the mixture and registration error model. Circular buffers are
managed (similarly to START), and the checked nearest neighbor classification
is written to disk as a type BYTE image.
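
The OK check on the initial classification can be sketched as follows. This is an in-memory illustration, not the CLASSIFY subroutine, and it reads "one neighbor in the same class" as "at least one of the four neighbors", which is an assumption.

```python
def ok_map(classes):
    """Mark each pixel OK if at least one of its four neighbors
    carries the same class label."""
    rows, cols = len(classes), len(classes[0])
    ok = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and classes[rr][cc] == classes[r][c]):
                    ok[r][c] = True
                    break
    return ok


classes = [
    [1, 1, 2],
    [1, 3, 2],
    [1, 1, 2],
]
ok = ok_map(classes)
```

Only the isolated center pixel (class 3) is not OK here; it would be the one examined for reclassification against the mixture model.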

Finish (AMSTATS) 

The optional STATFILE is written.

G. COMMENT 


The user is advised to be cautious about interpreting any clustering
of image data. Many images, indeed, are not suitable for clustering. If
AMOEBA is selected, the output parameter Pair PMC is a good indication of
accuracy: more than 25 percent and the area was probably poorly clustered.
Under 20 and clustering was at least self-consistent.

Users wanting to learn more about the method or the underlying model
should consult the reference.

H. REFERENCE 


Jack Bryant, "On the clustering of multidimensional pictorial data,"
Pattern Recognition 11, pp. 115-125 (1979).




APPENDIX D 

SAMPLE INTERACTIVE SESSION 
APPENDIX E
SAMPLE BATCH JOB

1. Create, using the Editor, and store the following file:

!JOB jobcard
!IDIMS
inputimage>AMOEBA(parameterlist)>outputimage
>END
!EOJ

2. Exit the Editor and enter
:STREAM filename.

3. Return later for results.

In some systems, it may be mandatory to store the input image (or perhaps
both); someone should be around to respond to requests to hang tapes.
RICE SCENE RADIATION RESEARCH PLAN


The goal of rice scene radiation research is to develop an under-
standing of the functional relationships between rice and its spectral
characteristics. These functional relationships will be integrated
into spectral-agrometeorological models for use in crop identifica-
tion, development stage estimation, and condition assessment.

CROP IDENTIFICATION 
Introduction 

Knowledge of the cultural and biophysical characteristics of
crops and their relationships to spectral response is an important
input to the pattern recognition research effort. For crop identifi-
cation, this research will provide information on what crops can and
cannot be separated using the current and planned sensor technologies,
what additional kinds of measurements are needed, and the important
times and frequency of observations needed to enable crop discrimina-
tion and identification. For the sampling and estimation research
effort, knowledge of the cultural and biophysical characteristics of
crops which significantly affect spectral response is needed in order
to account for the agronomic factors of importance in an advanced
dynamic sampling and estimation approach.

Technical Issues 

Scene radiation applied research issues in crop identification 
which have been defined are: 

1. What are the key cultural and biophysical characteristics of crops
(which are potentially observable from remotely sensed data) that
permit separability between crops and identifiability of crop
types at harvest and earlier in the season?

2. How are the cultural and biophysical characteristics related to
crop type manifested in the spectral response observed by existing
and planned sensors such as MSS, TM, and other advanced sensors?

3. What new kinds of observations are needed to improve the estimates
of key crop characteristics shown to permit separability between
crops and identifiability of crop types?

Technical Approach 

Crop identification research for rice involves field research,
canopy modeling, and Landsat data analysis. Field research data will
include intensive agronomic and spectral measurements. Canopy geome-
try measurements and available information on leaf reflectance and
transmittance will enable modeling of canopies and thus increase the
range of canopy conditions which can be studied with confidence. Use
of field measurements and canopy modeling will enable extension of
that knowledge to the relationship with Landsat MSS and TM data.

Task Descriptions 

Identify Cultural and Biophysical Characteristics 
Related to Crop Identification 

The first technical issue has the following specific objectives:

* Determine the key differences between rice and its confusion crops
in the timing and duration of key physiological and cultural
events.

* Determine the key differences in canopy geometry among rice varie-
ties and the optical properties of canopy components.

* Determine the cultural, regional, and environmental differences
among rice and its confusion crops.

* Represent the distribution of the key crop characteristics by func-
tional forms.

A literature review will be conducted to identify planting dates,
regional crop calendars, soil surveys and other descriptions of man-
agement practices. The periodic observations acquired on sample seg-
ments in the United States will also be used to help detail management
practices (e.g., row width, planting dates) and provide more localized
information on planting date and development stage.

When the data have been compiled, the crop characteristics will
be related to geographic region and to other scene characteristics and
management practices. Functional forms will be found which describe
the distributions within and across the geographic regions.

Relate Agronomic Characteristics to Spectral Response 

The second objective is to determine how cultural and biophysical
characteristics affect the spectral response of rice observed by
existing and planned sensors such as MSS, TM, and other advanced sen-
sors. It has two sub-objectives:

* Determine the relationships among key crop characteristics, remote 
sensing observables (band means and transformations with current 
and planned sensor systems), and background effects (e.g., soil and 
water background, atmospheric haze, view angle, and sun angle). 

* Determine which remote sensing observables are predominantly a 
function of the crop characteristics of interest and are least sen- 
sitive to background effects. 
A set of remote sensing observables including MSS bands, tasseled
cap components, MSS band ratios (e.g., 7/5), TM bands, trans-
formations of TM bands, and bands of other sensors will be examined.
For all sensors, research into appropriate bands or transformations
for estimation of particular canopy characteristics will be conducted.

The relationships of these remote sensing observables to crop
characteristics and to scene characteristics which are not of interest
will be examined. To do this, both field-acquired data and data
modeled using canopy geometry and leaf reflectance and transmittance
measurements will be used. Correlations, regressions, and linear and
nonlinear models will be used as appropriate to describe the relation-
ship of the remote sensing observables to, and the amount of variabil-
ity due to, green leaf area index, percent soil cover, green biomass,
development stage, and temporal trajectories.

After the functional relationships have been determined, sensi- 
tivity analyses will be conducted on the variables of interest to 
determine the change in spectral response given a certain change in 
the canopy variable. This will enable determination of which canopy 
and background variables are important in determining spectral res- 
ponse. 

Finally, the "best" bands or transformations for each of the sen- 
sors will be determined to be those which maximize sensitivity to var- 
ious individual crop characteristics and minimize sensitivity to 
undesired effects. The crop discrimination power of these sets of 
remote sensing observables will be tested using multivariate analysis 
on data from one or more intensive test sites. 
Investigate Potential Improvements Due to New Data Types

The third objective is to determine what new kinds of observa-
tions are needed to improve the estimates of key crop characteristics
shown to permit separability between crops and identifiability of crop
types. The specific objectives addressing this issue are:

* Determine the functional relationships among key crop characteris-
tics, background effects, and spectral response observable in other
spectral regions or with other types of measurements.

* Identify new data types which improve the relationships with key
crop characteristics used for crop discrimination while minimizing
background effects.

To address this issue, spectral measurements must be acquired in
the field and over test sites with sensors other than the current and
planned sensors. Helicopter spectrometer and/or aircraft scanner data
covering other visible and near-to-middle IR regions, thermal measure-
ments, microwave measurements, and illumination/view angle measure-
ments are required. The approach for addressing this issue will
parallel that of the second issue except for the measurements util-
ized.


Data Requirements 

The selection of treatments consists of first identifying the
major sources of variation in the growth, development, and spectral
response of rice. These factors include planting date, variety,
plant population/row spacing, soil conditions, and weather. The
levels of each factor will be selected to sample the range of expected
conditions in commercial fields.

Spectral measurements will be made in controlled plots using the
EXOTECH 100A radiometer and the Barnes multiband radiometer system
(having the TM bands). Detailed agronomic measurements of the crop
canopies including crop development stage, vegetation measurements,
crop condition, soil background condition, and grain yield will be
collected.


DEVELOPMENT STAGE ESTIMATION 
Introduction 

Crop development stage is important for crop identification and
yield modeling. There are three approaches for estimating crop
development stages: (1) average crop calendars based on accumulation
of days between stages, (2) meteorological methods based on
accumulation of thermal or photo-thermal units between stages, and (3)
spectral methods based on changes in spectral response as a function
of development stage. The goals of this task are to investigate the
use of spectral measurements to determine crop development stage and
to develop a meteorologically driven stage of development model that
will accept spectral inputs.
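
As an illustration of approach (2), the sketch below accumulates daily thermal units and reports the latest stage whose requirement has been met. Everything specific in it is an assumption made for the example: the averaging formula (daily mean temperature minus a base temperature, floored at zero), the base of 10 degrees C, the stage names, and the threshold values are hypothetical and are not taken from this plan.

```python
def thermal_units(tmin, tmax, base):
    """Thermal units for one day: mean temperature above the base, never negative."""
    return max(0.0, (tmin + tmax) / 2.0 - base)


def stage_from_weather(daily_min_max, stage_thresholds, base=10.0):
    """Return (stage, accumulated units).

    stage_thresholds is a list of (stage name, units required since planting),
    sorted by increasing requirement (hypothetical values).
    """
    total = sum(thermal_units(lo, hi, base) for lo, hi in daily_min_max)
    stage = None
    for name, needed in stage_thresholds:
        if total >= needed:
            stage = name
    return stage, total


weather = [(15.0, 25.0)] * 10  # ten days of (min, max) temperatures, degrees C
thresholds = [("emergence", 50.0), ("tillering", 300.0)]
stage, total = stage_from_weather(weather, thresholds)
```

A model that accepts spectral inputs could, for example, replace the accumulated total with a spectrally estimated stage whenever one is available; that coupling is not sketched here.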

Technical Issues 

Research issues for rice development stage estimation are: 

1. What are the key biophysical characteristics of crops (which are
potentially observable using remotely sensed data) that permit
their development stage to be determined?

a. What are the critical development stages of crops with respect
to crop identification, condition assessment and yield
prediction?

b. What are the key differences in timing and duration of key
developmental events?

c. What are the key differences in canopy geometry and composi- 
tion related to development stage? 

d. How do these differences depend on cultural, environmental and 
geographical factors? 

e. What are reasonably representative functional forms for the 
distribution functions of the key crop characteristics? 

2. What are the functional relationships between development stage 
and the radiometric characteristics of crop canopies? 

a. MSS bands 

b. TM bands

c. Transformations of MSS and TM data 

d. Other sensors 

3. How are the functional relationships affected by cultural, envir- 
onmental and geographic factors (e.g. variety, row width, soil 
type, moisture stress)? 

4. How can spectrally derived development stage information best be 
utilized? 

a. Development of models which, given spectral plus weather data, 
predict development stage 

b. Development of models which, given development stage, agromet 
conditions, and canopy geometry, predict spectral response 

5. What is the improvement in performance of large area crop growth
and yield models by using spectrally derived inputs (i.e., evalua-
tion of models in the context of a large area crop yield model)?




Technical Approach 


The general technical approach for addressing the research objec-
tives in the crop development stage area will involve estimation
theory. The radiometric characteristics of rice canopies will be
modeled to determine functional relationships with development stage
and/or time. This development will rely on both ground and satellite
measured spectral data in the MSS and TM bands, and meteorological
data. The trajectories of the development stage of crops in spectral
space will be analyzed to identify variables with superior properties
for estimating development stages of rice. Various estimators will be
examined individually and together to determine their predictive
abilities.


Task Description 

Four research tasks must be completed in the area of estimating
crop development stage. Agronomic information on rice must be
obtained describing the key biophysical characteristics that permit
development stage to be determined. The necessary information will be
obtained from technical literature, Texas A&M agronomists and field
measurements. The key biophysical characteristics are needed to gain
a physical understanding of problems associated with using spectral
measurements to estimate stage of development.

The second research task involves an analysis of multiyear agron-
omic, spectral, and meteorological data. The development stage tra-
jectories of rice will be examined in spectral/agronomic space to
identify worthwhile spectral estimators of crop development stage.

The third research task involves development of an agrometeoro-
logical stage of development model that accepts spectral inputs. The
model will be developed using agronomic and meteorological data
obtained from rice experiment stations in three different climatic
regions. The model will be developed to obtain a high degree of
accuracy for the three maturity classes of rice.

The fourth task will consist of development of a framework for
merging the spectral estimators of stage of development with the agro-
meteorological model so that the spectral estimates of growth stage
can be used to "correct" the model estimates, if necessary.

Data Requirements 

To identify the form of relationship between crop development
stage and spectral variables and to develop initial models, reflect-
ance measurements and observations of development stage at all growth-
development stages for a representative set of cropping practices and
soils are required.

The specific data requirements are:

- Reflectance measurements in the Landsat MSS bands and TM bands.

- Rice development stage observations.

- All growth and development stages from pre-planting to post-harvest
sampled.

- Frequency of observations at 5-7 day intervals.

- Representative treatments above sampled (i.e., several soil types,
varieties and planting dates).

- Daily meteorological data: temperature, relative humidity, solar
radiation, precipitation, etc.

- Atmospheric measurements on days spectral data are acquired.


After initial model forms have been developed, a larger data set
is required to test and evaluate the models. This data set should be
acquired over 3-5 additional domestic and international experiment
stations at locations having different soils, weather, and cropping
practices.


CROP CONDITION ASSESSMENT 
Introduction 

Potentially, multispectral data contains additional information
about crops other than identification. Relatively little research has
been conducted on developing and exploiting the capability of multi-
spectral data to provide information about crop condition and yield.
For example, the ratio of near infrared to red reflectances and the
greenness transformation have highly significant relationships with
leaf area index (LAI). Agronomic variables, such as LAI and percent
soil cover, are frequently required inputs to crop models which depict
limitations imposed on crop growth and yields by weather. Additional-
ly, measures of the presence and degree of stress, such as moisture
and nutrient deficits and disease and insect infestations, are poten-
tially important inputs to crop growth and yield models. Thus, if
agronomic variables related to yield could be reliably estimated from
multispectral satellite data, then physiologically-based crop growth
and yield models can be implemented for large areas.

Technical Issues 

The research and development program to assess crop condition and 
provide inputs to yield models will address the following issues: 
- What are the important biophysical variables related to the
condition and yield of rice? Which of these variables can be
potentially estimated using remotely sensed data?

- How are the functional relationships between spectral variables and
biophysical parameters of rice affected by variation in soil
background, crop production practices, and environmental
conditions?


Technical Approach 

Field measurements which include stress (temperature and mois-
ture) treatments will be used to determine the effect of stress on
biophysical factors. The Suits reflectance model will be used to pre-
dict changes in reflectance due to the changes in the biophysical
characteristics of rice. This information will be used to support
crop identification and to develop techniques for using spectral esti-
mates of crop condition in agrometeorological yield models.

Task Description 

The two research tasks in condition assessment are to determine
the key biophysical descriptors of crop condition that can be observed
by remotely sensed data, and to determine how these characteristics
can be observed using remotely sensed data. These tasks will be
addressed using literature reviews, historical data analysis, and
field measurements.

Data Requirements 

Data requirements are the same as in Development Stage Estima-
tion.
POSSIBLE MODIFICATIONS OF
THE HISSE MODEL FOR PURE
LANDSAT AGRICULTURAL DATA


by


Charles Peters
Department of Mathematics
University of Houston
Houston, Texas


SUMMARY

This report explores an idea, due to A. Feiveson, for relaxing the
assumption of class conditional independence of LANDSAT spectral measurements
within the same patch (field). Theoretical arguments are given which show
that any significant refinement of the model beyond Feiveson's proposal will
not allow the reduction, essential to HISSE, of the pure data to patch summary
statistics. A slight alteration of the new model is shown to be a reasonable
approximation to the model which describes pure data elements from the same
patch as jointly Gaussian with a covariance function which exhibits exponential
decay with respect to spatial separation.
1. Basic HISSE Model and Its Modifications.

The original mathematical assumptions underlying HISSE are fully described
in [7]. Briefly, they are:

a) The sampled pure pixels are organized into p patches (fields)
and, corresponding to each patch j, there is a set of spectral
data measurements $X_j = (X_{j1}, \ldots, X_{jN_j})$, where $X_{jk}$ is the
(perhaps multitemporal) vector of spectral data from the kth pixel
in the jth patch. For each patch j, there is also an unknown
class designation $\theta_j \in \{1, \ldots, m\}$, where m is known.

b) The pairs $(\theta_j, X_j)$, $j = 1, \ldots, p$, are treated as independent
random variables. The $\theta_j$ have a common unknown discrete distribution
$\mathrm{Prob}[\theta_j = \ell] = \alpha_\ell > 0$, where $\sum_{\ell=1}^{m} \alpha_\ell = 1$.

c) Given that $\theta_j = \ell$, $X_{j1}, \ldots, X_{jN_j}$ are independently normally
distributed with unknown mean $\mu_\ell$ and unknown variance-covariance
matrix $\Sigma_\ell$.

A proposed modification due to A. Feiveson [3] introduces one additional
matrix parameter for each class. Assumption (c) is changed to

c') Given that $\theta_j = \ell$, $X_{jk} = \mu_\ell + e_j + d_{jk}$, where
$E(e_j) = E(d_{jk}) = 0$, $\mathrm{var}(e_j) = \Gamma_\ell$, $\mathrm{var}(d_{jk}) = \Sigma_\ell$, and the
$e_j$'s and $d_{jk}$'s are independent normal random variables. Thus
the elements $X_{j1}, \ldots, X_{jN_j}$ of $X_j$ are jointly normal with
marginal distributions $X_{jk} \sim N(\mu_\ell, \Sigma_\ell + \Gamma_\ell)$ and constant
within-patch covariance $\mathrm{cov}(X_{jk}, X_{ji}) = \Gamma_\ell$ for $k \neq i$.

Notice that the original assumption (c) is a limiting case of (c') obtained
by allowing $\Gamma_\ell = 0$.
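
The structure of (c') can be checked with a short simulation: when every pixel in a patch shares one patch-level term and adds its own independent pixel-level term, the covariance between two pixels of the same patch equals the variance of the shared term. The sketch below uses a single band for simplicity; the function names and the numerical values (mean 50, standard deviations 2 and 3, patches of 4 pixels) are hypothetical.

```python
import random


def simulate_patch(n_pixels, mu, sd_patch, sd_pixel, rng):
    """One patch: class mean + shared patch effect + independent pixel effects."""
    e = rng.gauss(0.0, sd_patch)
    return [mu + e + rng.gauss(0.0, sd_pixel) for _ in range(n_pixels)]


def within_patch_covariance(patches):
    """Sample covariance between the first and second pixel across patches."""
    firsts = [p[0] for p in patches]
    seconds = [p[1] for p in patches]
    n = len(patches)
    mean1 = sum(firsts) / n
    mean2 = sum(seconds) / n
    return sum(
        (a - mean1) * (b - mean2) for a, b in zip(firsts, seconds)
    ) / (n - 1)


rng = random.Random(1)
patches = [simulate_patch(4, 50.0, 2.0, 3.0, rng) for _ in range(20000)]
cov = within_patch_covariance(patches)  # should be close to 2.0 ** 2 = 4
```

Setting the patch-level standard deviation to zero recovers the independent case of the original assumption (c).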

For reasons discussed later, we will alter (c') to

(c") The constant within-patch covariance for elements of the
jth patch is $\mathrm{cov}(X_{jk}, X_{ji}) = \frac{1}{N_j}\Gamma_\ell$.

The effect of (c") is that data elements from large patches are considered more
weakly correlated than those from small patches. Assumption (c') is perhaps
more appropriate if the correlation between pixels of the same patch is really
independent of their spatial separation, while (c") is better if the correlation
falls off rapidly with spatial separation, on account of the preponderance of
spatially distant pairs in larger patches. Calculations are presented in Section
4 to suggest that (c") is a reasonable approximation to the average covariance
between pairs when the correlation decreases exponentially with spatial separation.
In Section 3 theoretical arguments are given which severely restrict the covariance
models for which the patch mean vector and scatter matrix are sufficient statistics
without, however, eliminating (c') and (c"). This is an important consideration,
since procedures like HISSE are feasible only if the spectral information in patches
can be summarized in a small number of statistics.


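2. Numerical Procedures for the Alternative Covariance Models.

The likelihood function and iterative procedure for the current version of
HISSE are given in [7] and will not be repeated here. For covariance models
(c') and (c"), the log likelihood function is

\[ L = \sum_{j=1}^{p} \log f(X_j) = \sum_{j=1}^{p} \log \sum_{\ell=1}^{m} \alpha_\ell f_\ell(X_j), \]

where, apart from a constant factor, for model (c')

\[ f_\ell(X_j) = |\Sigma_\ell|^{-\frac{N_j-1}{2}} \, |\Sigma_\ell + N_j\Gamma_\ell|^{-\frac{1}{2}} \exp[-\tfrac{1}{2} Q_\ell(X_j)], \]

\[ Q_\ell(X_j) = \mathrm{tr}\,\Sigma_\ell^{-1}S_j + N_j (m_j - \mu_\ell)^T (\Sigma_\ell + N_j\Gamma_\ell)^{-1} (m_j - \mu_\ell), \]

while for model (c")

\[ f_\ell(X_j) = |\Sigma_\ell|^{-\frac{N_j-1}{2}} \, |\Sigma_\ell + \Gamma_\ell|^{-\frac{1}{2}} \exp[-\tfrac{1}{2} Q'_\ell(X_j)] \]

and

\[ Q'_\ell(X_j) = \mathrm{tr}\,\Sigma_\ell^{-1}S_j + N_j (m_j - \mu_\ell)^T (\Sigma_\ell + \Gamma_\ell)^{-1} (m_j - \mu_\ell). \]

In both these expressions $m_j$ and $S_j$ are, respectively, the patch mean and
scatter

\[ m_j = \frac{1}{N_j} \sum_{k=1}^{N_j} X_{jk}, \qquad S_j = \sum_{k=1}^{N_j} (X_{jk} - m_j)(X_{jk} - m_j)^T. \]

Thus for both of these covariance models the patch mean and scatter are jointly
sufficient.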
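
For concreteness, the two patch summary statistics can be computed as below. This is a plain-Python sketch rather than HISSE code; it assumes the scatter is taken about the patch mean, and the function name and the toy pixel values are hypothetical.

```python
def patch_mean_and_scatter(pixels):
    """Return (m, S) for one patch.

    pixels: list of equal-length measurement vectors.
    m: componentwise mean; S: sum of outer products of deviations from m.
    """
    n = len(pixels)
    dim = len(pixels[0])
    m = [sum(x[i] for x in pixels) / n for i in range(dim)]
    scatter = [[0.0] * dim for _ in range(dim)]
    for x in pixels:
        dev = [x[i] - m[i] for i in range(dim)]
        for a in range(dim):
            for b in range(dim):
                scatter[a][b] += dev[a] * dev[b]
    return m, scatter


pixels = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
m, S = patch_mean_and_scatter(pixels)
```

Because the densities depend on the pixels of a patch only through these two quantities and the patch size, each patch can be reduced to them before any iteration begins.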

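The unconstrained likelihood equations for model (c") have the form

(1.1)
\[ \alpha_\ell = \frac{1}{p} \sum_{j=1}^{p} \frac{\alpha_\ell f_\ell(X_j)}{f(X_j)} \]

(1.2)
\[ \mu_\ell = \frac{\sum_{j=1}^{p} N_j m_j \, \frac{f_\ell(X_j)}{f(X_j)}}{\sum_{j=1}^{p} N_j \, \frac{f_\ell(X_j)}{f(X_j)}} \]

(1.3)
\[ \Psi_\ell = \frac{\sum_{j=1}^{p} N_j (m_j - \mu_\ell)(m_j - \mu_\ell)^T \, \frac{f_\ell(X_j)}{f(X_j)}}{\sum_{j=1}^{p} \frac{f_\ell(X_j)}{f(X_j)}} \]

(1.4)
\[ \Sigma_\ell = \frac{\sum_{j=1}^{p} S_j \, \frac{f_\ell(X_j)}{f(X_j)}}{\sum_{j=1}^{p} (N_j - 1) \, \frac{f_\ell(X_j)}{f(X_j)}} \]

where the new parameter is defined as $\Psi_\ell = \Sigma_\ell + \Gamma_\ell$.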

The expressions on the right of equations (1.1) - (1.4) are appealing in
that they are averages of quantities whose expectations, given $\theta_j = \ell$, are
the parameters on the left. In addition, the successive substitutions scheme
suggested by equations (1.1) - (1.4) is a slight variation of the generalized
E-M procedure of Dempster, Laird, and Rubin [2]. For covariance model (c'),
the likelihood equations do not suggest a natural iterative procedure and it
appears that the generalized E-M procedure has no simple formulation.

To be consistent with the original interpretation of the parame;er T.^ 
as a variance-covariance matrix, it is necessary to maximize the likelihood 
subject to the additional inequality constraint ^ Since a solution of 
equations (1.1) - (1.4) need not satisfy this constraint, maximizing the likeli- 
hood subject to requires a much more complicated numerical procedure. 

The condition is equivalent to a set of scalar inequality and nonlinear 

equality constraints, and numerical procedures for such problems are generally 
very slow to converge. The unconstrained maximum likelihood procedure is
appropriate if in (c") we merely assume that $\mathrm{cov}(X_{ji}, X_{jk})$ is the same for all
i and k, without introducing random variables $e_j$ and $d_{jk}$.

3. Covariance Models for which patch mean and scatter are sufficient.

Let $X = (X_1 | \cdots | X_N)$ be an n × N matrix whose columns are jointly normally
distributed n-vectors. We are interested in characterizing those families
of distributions of X for which the statistic (m,S) is sufficient,
where $m = X_1 + \cdots + X_N$ and $S = X_1 X_1^T + \cdots + X_N X_N^T$. We begin by recalling
the following definitions [4, p. 32].


Definition: Let G be a group of homeomorphisms on $\mathbb{R}^k$. A function T
defined on $\mathbb{R}^k$ is invariant under G if T(gx) = T(x) for all $x \in \mathbb{R}^k$, $g \in G$.
T is a maximal invariant of G if T is invariant and T(x) = T(y) implies
that there is a $g \in G$ such that y = gx. A measure $\lambda$ is invariant under G
if $\lambda g = \lambda$ for all $g \in G$, where $\lambda g(E) = \lambda(g(E))$.


Lemma 1: Let elements of $\mathbb{R}^{nN}$ be represented as $x = (x_1 | \cdots | x_N)$ and let
$e^T = (1, \cdots, 1)_{1 \times N}$. For each N × N real orthogonal matrix u satisfying
ue = e, let $g_u(x) = xu$. Then $T(x) = (m,S) = (xe, xx^T)$ is a maximal invariant
of the group $G = \{g_u\}$.


Proof: $T(g_u x) = (xue, xu(xu)^T) = (xe, xx^T) = T(x)$. Thus T is invariant.
Suppose that T(x) = T(y), so that xe = ye and $xx^T = yy^T$. If $x^{(i)}$ and $y^{(i)}$
denote the i-th rows of x and y, then $x^{(i)} x^{(j)T} = y^{(i)} y^{(j)T}$ and $x^{(i)} e = y^{(i)} e$
for all i and j. This implies that corresponding rows of x and y have the
same Euclidean norm and form the same angle with the vector $e^T$. In addition, the
rows of x describe the same set of angles in $\mathbb{R}^N$ as do the corresponding rows
of y. Thus, by carrying out parallel Gram-Schmidt procedures on
$\{e^T, x^{(1)}, \cdots, x^{(n)}\}$ and $\{e^T, y^{(1)}, \cdots, y^{(n)}\}$, it is easy to construct an
orthogonal matrix u such that $e^T u = e^T$ and $x^{(i)} u = y^{(i)}$ for each i; that
is, such that $y = g_u x$. Therefore T is a maximal invariant.

Example: Any linear function T defined on $\mathbb{R}^k$ is a maximal invariant under
the group of translations by elements of the kernel of T. In fact, most of
the results in [6] characterizing linear sufficient statistics depend only on
this aspect of linearity.

If T is a maximal invariant then any invariant function on $\mathbb{R}^k$ is a
function of T(x). Moreover, a function $h \circ T$ on $\mathbb{R}^k$ is a maximal invariant
if and only if h is one to one on the range of T. In the theorems which
follow we shall require that T be a continuous open mapping, in addition
to being a maximal invariant. The following lemma shows that to some extent T
may be chosen for convenience, without affecting the property of openness.
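As a numerical illustration of the invariance half of Lemma 1, the sketch
below checks that the column sum xe and the scatter xx^T are unchanged when x
is multiplied on the right by an orthogonal matrix u with ue = e. The data
values, the helper names, and the choice of a Householder reflection (built
from a vector whose entries sum to zero) are assumptions made only for this
example; they do not come from the report.

```python
# Check that T(x) = (x e, x x^T) is unchanged by x -> x u when u is
# orthogonal and u e = e. All names and values here are illustrative.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def householder_fixing_e(v):
    """Reflection u = I - 2 v v^T / (v^T v); u e = e when sum(v) == 0."""
    n = len(v)
    vv = sum(t * t for t in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv
             for j in range(n)] for i in range(n)]

def sum_and_scatter(x):
    """Return (x e, x x^T) for an n x N matrix x whose columns are pixels."""
    xe = [sum(row) for row in x]
    return xe, matmul(x, transpose(x))

# n = 2 spectral channels, N = 4 pixels (made-up values).
x = [[1.0, 2.0, 4.0, 7.0],
     [0.5, -1.0, 3.0, 2.0]]
# The entries of v sum to zero, so the reflection leaves e fixed.
u = householder_fixing_e([1.0, -2.0, 0.5, 0.5])

m0, s0 = sum_and_scatter(x)
m1, s1 = sum_and_scatter(matmul(x, u))
```

The transformed matrix xu generally differs from x entry by entry, yet
(m0, s0) and (m1, s1) agree up to rounding, which is the invariance used in
the proof.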

Lemma 2: Let V be an open subset of $\mathbb{R}^k$, let G be a group of homeomorphisms
from V to V and let $T_1$ and $T_2$ be continuous maximal invariants of G
defined on V with values in $\mathbb{R}^q$. If $T_1$ is an open mapping then so is $T_2$.

Proof: Since $T_1$ and $T_2$ are maximal invariants, there is a one to one
function $h: T_1(V) \to T_2(V)$ such that $T_2 = hT_1$. Since $h^{-1} = T_1 T_2^{-1}$ on $T_2(V)$,
$T_2$ is continuous and $T_1$ is open, h is continuous. By the Brouwer invariance
of domain theorem [8, p. 3] h is an open mapping. Therefore, $T_2$ is also open.

Theorem 1: Let V be an open subset of $\mathbb{R}^k$, let $\mathcal{M}$ be a homogeneous collection
of finite Borel measures on $\mathbb{R}^k$, and let $\lambda$ be a fixed element of $\mathcal{M}$. Suppose
that $\lambda(V^c) = 0$ and $\lambda(U) > 0$ for each nonempty open subset U of V. Let
G be a group of homeomorphisms from V to V such that $\lambda(gB) = 0$ whenever
$\lambda(B) = 0$ and $g \in G$. Suppose that $f_\mu$ is a continuous representative of $d\mu/d\lambda$
for each $\mu \in \mathcal{M}$ and that $T: V \to \mathbb{R}^q$ is a continuous open maximal invariant of
G. Then T is a sufficient statistic for $\mathcal{M}$ if and only if each $f_\mu$ is
invariant under G.

Proof: Suppose that T is sufficient. Then for each $\mu \in \mathcal{M}$ there exists a
Borel measurable function $k_\mu$ such that $k_\mu \circ T$ is a version of $d\mu/d\lambda$ [1].
Let $\mu \in \mathcal{M}$ and $g \in G$ be fixed. The set

$$U = \{x \in V \mid f_\mu(x) \neq f_\mu(gx)\}$$

is an open subset of $B \cup g^{-1}(B)$, where

$$B = \{x \in V \mid f_\mu(x) \neq k_\mu(T(x))\}.$$

Since $\lambda(B) = 0$, $\lambda(g^{-1}(B)) = 0$ and $\lambda(U) = 0$. Therefore, U is empty and it follows
that $f_\mu$ is invariant. Conversely, if each $f_\mu$ is invariant, then for each
$\mu \in \mathcal{M}$ there exists a function $h_\mu$ such that $f_\mu = h_\mu \circ T$. Since $f_\mu$ is continuous
and T is open, $h_\mu$ is continuous on T(V). Therefore, by [1, Corollary 6.1]
T is sufficient.

Corollary 1.1: Given the hypotheses of Theorem 1, if $\lambda$ is invariant then T is
sufficient if and only if each $\mu \in \mathcal{M}$ is invariant.

Proof: In general, a density with respect to $\lambda$ of $\mu g$ is $f_{\mu g} = (f_\mu \circ g)h$,
where h is a version of $d\lambda g/d\lambda$. If $\lambda$ is invariant, then we can take h = 1
to obtain $f_{\mu g} = f_\mu \circ g$ as a unique continuous density of $\mu g$, for each $\mu$, g. By
Theorem 1, T is sufficient if and only if $f_{\mu g} = f_\mu$, which is equivalent to
$\mu g = \mu$.

Suppose that $\mu g \in \mathcal{M}$ for each $\mu \in \mathcal{M}$, $g \in G$ and that $\theta$ is an
r-dimensional parameterization of $\mathcal{M}$; i.e., a one to one function from $\mathcal{M}$ onto
$\Theta = \theta(\mathcal{M}) \subset \mathbb{R}^r$. Then there is a homomorphism $g \to \bar{g}$ from G onto a group $\bar{G}$
of transformations on $\Theta$ defined by $\bar{g}(\theta_\mu) = \theta(\theta^{-1}(\theta_\mu)g)$. The following corollary
is clear.

Corollary 1.2: Given the hypotheses of Theorem 1, if $\lambda$ is invariant then T
is sufficient iff $\bar{G}$ is the trivial group consisting only of the identity mapping
on $\Theta$.

To apply these results to the characterization problem at hand, let
$X = (X_1 | \cdots | X_N)$ be a random n × N matrix having one of a given family of
normal distributions and let $X^{(i)}$ denote the i-th row of X. We think of
$X_1, \cdots, X_N$ as being the observed random vectors, but at various times wish to
consider the parameters

$$\mu_i = E(X_i)$$
$$C_{ij} = \mathrm{cov}(X_i, X_j)$$
$$R_{ij} = \mathrm{cov}(X^{(i)T}, X^{(j)T}).$$

For the open set V of Theorem 1, we take the set of regular points of
$T(x) = (xe, xx^T)$; that is, the set of points x at which T'(x) is surjective.
T'(x) is surjective if the matrix $\begin{pmatrix} e^T \\ x \end{pmatrix}$ has rank n + 1, which is almost certainly
true for any of the probabilities under consideration as soon as N > n + 1.
Clearly any of the mappings $g_u$ of Lemma 1 is a homeomorphism from V onto itself
and T is a continuous open mapping on V. $\mathcal{M}$ will be the given set of nN-variate
normal probability measures. The invariant measure $\lambda$ of Corollary 1.2 will
be that given by $\mu_i = 0$, $C_{ij} = 0$ if $i \neq j$, $C_{ii} = I_{n \times n}$. If $\lambda$ is not already
a member of $\mathcal{M}$, it may be added without affecting the sufficiency of T for $\mathcal{M}$.
According to Corollary 1.2 and Lemma 1, T is sufficient for $\mathcal{M}$ if and only if

(2.1)    $\mu^{(i)} u = \mu^{(i)}$

and

(2.2)    $u^T R_{ij} u = R_{ij}$

for all i, j and $u \in U$ = {N × N orthogonal matrices u such that ue = e},
where $\mu^{(i)}$ denotes the i-th row of $(\mu_1 | \cdots | \mu_N)$.

Now, (2.1) holds if and only if each $\mu^{(i)} = \lambda_i e^T$ for some scalar $\lambda_i$, which
is equivalent to $\mu_1 = \cdots = \mu_N$. In (2.2) U may be replaced by the larger set
U' = {N × N orthogonal matrices u such that $ue = \pm e$}. Let $P = \frac{1}{N} e e^T$ and
Q = I - P. Then U' is the set of all orthogonal matrices which commute with
P, and (2.2) states that each $R_{ij}$ commutes with each $u \in U'$. Let w
be an orthogonal matrix such that

$$wPw^T = \begin{pmatrix} 1 & 0_{1 \times (N-1)} \\ 0_{(N-1) \times 1} & 0_{(N-1) \times (N-1)} \end{pmatrix}.$$

Then U' is the set of all orthogonal matrices u such that $wuw^T$ commutes
with $wPw^T$, and (2.2) holds iff $wR_{ij}w^T$ commutes with $wuw^T$ for each $u \in U'$.

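Elementary calculations show that $wuw^T$ must be of the form

$$wuw^T = \begin{pmatrix} \pm 1 & 0 \\ 0 & v \end{pmatrix}$$

where v is (N-1) × (N-1) orthogonal, and that for some scalars $\lambda_1^{(i,j)}$, $\lambda_2^{(i,j)}$

$$wR_{ij}w^T = \begin{pmatrix} \lambda_1^{(i,j)} & 0 \\ 0 & \lambda_2^{(i,j)} I_{(N-1) \times (N-1)} \end{pmatrix}.$$

It follows that (2.2) is true iff each $R_{ij}$ is a linear combination of
P and Q. Therefore, (2.2) holds if and only if each $R_{ij}$ has constant
diagonal elements and constant off diagonal elements, which may depend on i
and j. Thus, there are matrices $A = (a_{ij})$ and $B = (b_{ij})$ such that

$$\mathrm{cov}(X_k^{(i)}, X_\ell^{(j)}) = \begin{cases} a_{ij} & \text{if } k = \ell \\ b_{ij} & \text{if } k \neq \ell. \end{cases}$$

That is,

$$\mathrm{var}(X_k) = A \quad \text{for all } k$$

and

$$\mathrm{cov}(X_k, X_\ell) = B \quad \text{if } k \neq \ell.$$

Consequently, A and B are symmetric and we have established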

Theorem 2: Let $X_1, \cdots, X_N$ be jointly normally distributed n-vectors whose
joint distribution is a member of a family $\mathcal{M}$. Then the mean and scatter matrix
of the $X_i$'s are sufficient for $\mathcal{M}$ if and only if for each member of $\mathcal{M}$,
(a) the $X_i$'s are identically distributed, and (b) $\mathrm{cov}(X_i, X_j)$ is independent
of i and j.

4. Conclusion:

As we mentioned in Section 1, if one thinks of a patch as an approximation
to a field then it is difficult to understand how the within-patch covariance of
spectral measurements from a given patch could be constant but dependent on the
patch size as in (c"). According to the results of Section 3, there is no more
sophisticated covariance model whose parameters can be estimated with optimum
efficiency using only the patch means and scatters; however, there may be
more realistic covariance models which are well approximated by (c') or (c").

For example, suppose that a patch is rectangular in shape with multidimensional
spectral information $\{X_{ij} : i = 1, \cdots, r;\ j = 1, \cdots, c\}$ where i and j denote the
spatial line and column number of the pixel producing $X_{ij}$. Suppose further that
the correlation of two observations $X_{ij}$ and $X_{k\ell}$ decays exponentially with
their spatial separation; that is,

$$\mathrm{cov}(X_{ij}, X_{k\ell}) = A^{|i-k|} B^{|j-\ell|} \Omega$$

where $\Omega$ is their common variance matrix and A and B are symmetric commuting
matrices of spectral radius less than 1. Let $\bar{\Sigma}$ be the average covariance over
all pairs of distinct pixels. Then a simple calculation shows that for large
r and c (large patch size) $rc\bar{\Sigma}$ is nearly $A(I-A)^{-1}B(I-B)^{-1}\Omega$, so
that $\bar{\Sigma}$ is nearly inversely proportional to the patch size, as is required by (c").
If A and B are positive semidefinite, so that $z^T X_{ij}$ is always positively
correlated with $z^T X_{k\ell}$ for any z, then the expression just given is an upper bound
for the average within-patch covariance for any patch size. Therefore, the effect
of approximating the exponential covariance model with the constant covariance
model (c") may be predictable, and not serious.
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Abstract 


This report discusses the Akaike Information Criterion (AIC) 
with special emphasis on the application of the AIC to mixture 
models. The theory and applications of the AIC are discussed. 
Mixture model simulations and the application of the AIC to a 
large portion of a Landsat segment are presented. 


Table of Contents 


Section

1. Introduction

2. Description of the AIC

3. Derivation of the AIC

4. Applications

5. Mixture Models

6. Mixture Simulations

7. Concluding Remarks

8. References

Appendix

A. Proofs of the approximations provided in section 3

B. Figures for section 6


1. INTRODUCTION 


Estimation of parameters in a statistical model is a familiar
and well discussed topic, but a more important topic, and certainly
a more difficult one, is the selection of the appropriate
model. The AIC (Akaike's Information Criterion) is a useful
tool in model selection. It is particularly important in selecting
the order of the model or in selecting the number of free
parameters in a model. This report discusses the use of the AIC
in selecting the order of a model and we emphasize the use of
the AIC in determining the number of components in a mixture model.

After introducing appropriate notation in section 2, we show 
that the AIC is an extension of the maximum likelihood principle, 
as well as an entropy maximization principle. Section 3 will 
discuss the derivation of the AIC. In section 4, we give appli- 
cations of the AIC to model selection in a number of important 
problems and we also introduce the BIC which is discussed in 
Hannan [1980]. Section 5 will discuss the application of the 
AIC to the mixture models. In section 6, we will look at the 
effectiveness of the AIC in dealing with the order selection 
problem on some 1 and 2 dimensional mixture problems. 

2. DESCRIPTION OF THE AIC

We want to introduce the AIC as an extension of the maximum
likelihood method. Let's begin by explaining why such an extension
is needed. If we consider the case where the order of a
particular model is determined, then the maximum likelihood
method is an excellent method for obtaining an estimate for the
unknown parameters. The maximum likelihood estimate, under weak
assumptions, is a strongly consistent, asymptotically unbiased
and an asymptotically minimum variance estimator of the unknown
parameters (Zacks [1971]). However, in the case that the order
of the model is not known, the maximum likelihood estimator no
longer has all of these desirable properties. The cause of this
difficulty is that the maximum likelihood estimator has a preference
for models of high order. As the order of the model is
increased, the value of the maximum likelihood function, evaluated
at the maximum likelihood estimate for that order model, is
increased. Therefore, the maximum likelihood estimator will
always have too many parameters.

The use of the maximum likelihood estimator to estimate the
order of the model will lead to an estimate which fits the data
very well (in fact too well), but will be a very poor estimator
of the true density function. In section 4, we will use histograms
as a concrete example of this problem.

As a possible replacement for the maximum likelihood estimator,
we wish to consider an entropy maximization principle. This
approach to the AIC was introduced and developed by
Akaike [1972b, 1973, 1977]. It has also been supported by a
Bayesian approach in Akaike [1978, 1979, 1980].

Let X be a random variable with density function g(x). If
f(x) is any other density function, we can define a measure of
similarity of f and g by

$$B(g;f) = E_X \log \frac{f(X)}{g(X)}$$

which equals $\int \log\!\left(\frac{f(x)}{g(x)}\right) g(x)\,dx$ in the case of a continuous random
variable X, and which equals $\sum_{i=1}^{p} \log\!\left(\frac{f_i}{g_i}\right) g_i$ in the discrete case.

This measure is the entropy of f and g as defined by Kullback
[1968]. It is non-positive and equals zero only in the case that
f = g almost everywhere.

One interpretation of this, in the discrete case, is that for
a sample of size N, the quantity N · B(g;f) is approximately the
logarithm of the probability of obtaining the data distribution
g(x) from the assumed model f(x).

Let f, a model for the data, be defined by

$$f(x) = \begin{cases} f_i & a_{i-1} < x \le a_i, \quad i = 1, \cdots, p \\ 0 & \text{otherwise} \end{cases}$$

where $\sum_{i=1}^{p} f_i\,(a_i - a_{i-1}) = 1$. Given that we have N independent
observations $x_1, x_2, \cdots, x_N$ from f we define $N_i$, $i = 1, \cdots, p$, to be
the frequency of observations in the interval $a_{i-1} < x \le a_i$ and
define relative frequencies $g_i$, $i = 1, \cdots, p$, by $g_i = N_i/N$.

The probability of observing the frequencies $\{N_1, \cdots, N_p\}$
from the model f is

$$W = \frac{N!}{N_1!\,N_2! \cdots N_p!}\; f_1^{N_1} f_2^{N_2} \cdots f_p^{N_p}.$$

From this we see that

$$\log W = \log N! - \sum_{i=1}^{p} \log N_i! + \sum_{i=1}^{p} N_i \log f_i$$

and using the fact that $\log N! \approx N \log N - N$, then

$$\log W \approx N \log N - N - \sum_{i=1}^{p} N_i \log N_i + \sum_{i=1}^{p} N_i + \sum_{i=1}^{p} N_i \log f_i$$

$$= -N \sum_{i=1}^{p} \frac{N_i}{N} \log \frac{N_i}{N} + N \sum_{i=1}^{p} \frac{N_i}{N} \log f_i$$

$$= N \cdot B(g;f).$$

So, N · B(g;f) is approximately the logarithm of the probability
of obtaining the distribution g from the assumed model f.
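The approximation log W ≈ N · B(g;f) can be checked numerically. The sketch
below computes the exact multinomial log probability with `math.lgamma` and
compares it with N · B(g;f); the cell probabilities and counts are made-up
values chosen only for illustration, and the function names are hypothetical.

```python
import math

def log_multinomial_prob(counts, probs):
    """Exact log W for cell counts N_i under cell probabilities f_i."""
    n = sum(counts)
    out = math.lgamma(n + 1)              # log N!
    for c, p in zip(counts, probs):
        out -= math.lgamma(c + 1)         # - log N_i!
        if c:
            out += c * math.log(p)        # + N_i log f_i
    return out

def entropy_b(g, f):
    """Discrete B(g;f) = sum_i g_i log(f_i / g_i); never positive."""
    return sum(gi * math.log(fi / gi) for gi, fi in zip(g, f) if gi > 0)

# Illustrative values (assumptions, not data from the report).
probs = [0.2, 0.3, 0.5]
counts = [3000, 2500, 4500]
n = sum(counts)
g = [c / n for c in counts]

exact = log_multinomial_prob(counts, probs)
approx = n * entropy_b(g, probs)
```

The two numbers differ by terms of order log N, which are the terms dropped
when log N! is replaced by N log N - N, so the relative error shrinks as the
sample grows.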

From a statistical inference point of view, we wish to find
a model $f(y, \hat\theta(x))$ which will maximize the expected entropy

$$E_X B(g; f(\cdot, \hat\theta(X))) = E_X E_Y \log\left\{\frac{f(Y, \hat\theta(X))}{g(Y)}\right\}$$

and this idea is well motivated by example 1.

Let $\ell(\hat\theta)$ denote the value of the log likelihood function,
evaluated at the maximum likelihood estimate $\hat\theta$,

$$\ell(\hat\theta) = \frac{1}{N} \sum_{j=1}^{N} \log f(x_j, \hat\theta(x_1, \cdots, x_N))$$

and let k denote the number of free parameters in the model. We
define the AIC function by

$$\mathrm{AIC} = -2\ell(\hat\theta) + \frac{2k}{N}.$$

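The factor of -2 is introduced for convenience, since in the normal
case

$$-2 \log \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) = \frac{(x-\mu)^2}{\sigma^2}.$$

Observe that

(1)    $E_X B(g; f(\cdot, \hat\theta(X))) = E_X E_Y \log f(Y, \hat\theta(X)) - c$

where $c = E_Y \log g(Y)$ is independent of the choice of f. Also, it can be shown
in many cases that for $\hat\theta$ the maximum likelihood estimate of $\theta$, we
have

(2)    $E_X E_Y \log f(Y, \hat\theta) \approx E_X\left(\ell(\hat\theta) - \frac{k}{N}\right).$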

By using (1) and (2), we see that an estimator which minimizes
the AIC should approximately maximize the entropy function. The
AIC estimate is a choice of parameters (which includes the choice
of the number of parameters) which minimizes the AIC.

There is a relationship between the AIC and certain classical
hypothesis tests. Let $\hat\theta_m$ and $\hat\theta_{m+k}$ be the maximum likelihood
estimates for the m and m + k order models. Under certain
regularity conditions we have that $2N[\ell(\hat\theta_{m+k}) - \ell(\hat\theta_m)]$ is asymptotically
$\chi^2(k)$ (Rao [1973]). In the case that this holds, one can apply
the Neyman-Pearson likelihood ratio test, and for a particular
level of significance this would be equivalent to use of the AIC.
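To make the "particular level of significance" concrete: the AIC prefers the
larger model exactly when the likelihood ratio statistic exceeds 2k, so the
implied level is P(chi-square with k degrees of freedom > 2k). The sketch
below evaluates this for k = 1 and k = 2 only, using the closed forms
P(chi2_1 > x) = erfc(sqrt(x/2)) and P(chi2_2 > x) = exp(-x/2). The function
name is hypothetical, and the restriction to k of 1 or 2 is a simplification
made to stay within the standard library.

```python
import math

def aic_implied_level(k):
    """Significance level at which a likelihood ratio test with k extra
    parameters agrees with the AIC (the statistic is compared with 2k)."""
    threshold = 2.0 * k
    if k == 1:
        # Chi-square with 1 degree of freedom: survival is erfc(sqrt(x/2)).
        return math.erfc(math.sqrt(threshold / 2.0))
    if k == 2:
        # Chi-square with 2 degrees of freedom: survival is exp(-x/2).
        return math.exp(-threshold / 2.0)
    raise ValueError("this sketch only covers k = 1 and k = 2")

level_one_extra = aic_implied_level(1)
level_two_extra = aic_implied_level(2)
```

Both levels are well above the conventional 0.05, which is one way to see
why the AIC admits extra parameters more readily than a standard test.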

The Neyman-Pearson theory is designed to handle a particular
type of composite hypothesis test and is not applicable to a variety
of situations. In contrast to this, the AIC has wide applicability.
It can be applied in the comparison of different types of models
and can be applied when the Fisher Information matrix is singular.

We consider the AIC as a simplification of the usual hypothesis
testing approach to model building.

A second way to think about the AIC is as a penalized likeli- 
hood estimate. There have been a number of penalized approaches. 
Good and Gaskins [1971] and later Tapia [1978] introduced 
roughness penalties for estimators in infinite dimensional spaces. 
Redner [1980] and Rossback and Lennington [1978] discuss penalties 
on mixture models. 

The AIC approach to the penalty term is appealing because
of the simplicity and generality of the approach. However, the
approximation in (2) is not valid for mixture models and we need
to investigate this problem.

3. DERIVATION OF THE AIC

Let g(x) denote the true density function and let $x_1, x_2, \cdots,
x_N$ be an independent sample from g(x). Let $f(x,\theta)$ be our model
for g(x) and let $\theta_0$ be the value of the parameter $\theta$ such that
$f(x,\theta_0) = g(x)$. We will let $\ell(\theta)$ denote the log likelihood
function and $\hat\theta$ the maximum likelihood estimate. Finally we define

$$I(\theta_0) = E_X\left[\frac{\partial \log f(X,\theta)}{\partial\theta}\,\frac{\partial \log f(X,\theta)}{\partial\theta}^T\right]_{\theta=\theta_0}$$

as the Fisher information matrix and

$$J(\theta_0) = -E_X\left[\frac{\partial^2 \log f(X,\theta)}{\partial\theta\,\partial\theta^T}\right]_{\theta=\theta_0}$$

as the negative of the expected value of the Hessian of the log
likelihood function.

If we assume that $I(\theta_0)$ is a nonsingular matrix, then under
certain regularity conditions we observe that $I(\theta_0) = J(\theta_0)$ and
furthermore that

$$\ell(\hat\theta) - \frac{1}{2N}\,\mathrm{tr}(J(\theta_0) I^{-1}(\theta_0))$$

is approximately an unbiased estimator for $E_X \ell(\theta_0)$. It also
follows that

$$\ell(\hat\theta) - \frac{1}{N}\,\mathrm{tr}(J(\theta_0) I^{-1}(\theta_0))$$

is approximately an unbiased estimator for the entropy. (See
appendix A.)

Since $J(\theta_0) = I(\theta_0)$, then $k = \mathrm{tr}(J(\theta_0) I^{-1}(\theta_0))$ is the number
of parameters in the model and so

$$E_X \ell(\theta_0) \approx E_X \ell(\hat\theta) - \frac{k}{2N}$$

and

$$-2E_Y \log f(y, \hat\theta(x)) \approx -2\ell(\hat\theta) + \frac{2k}{N}.$$

We have made two critical assumptions in these calculations
and they are both quite significant. First, we have assumed that
the Fisher information matrix is nonsingular. In the case that
it is singular, then the statements above are no longer valid.
What we can say is that

$$E_X \ell(\theta_0) \approx E_X \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{rank}(I(\theta_0))$$

and

$$-2E_Y \log f(y, \hat\theta(x)) \approx -2E_X \ell(\hat\theta) + \frac{2}{N}\,\mathrm{rank}(I(\theta_0)).$$

These facts will become important when we discuss finite
mixtures, since in that case, the Fisher information matrix is
often singular.

Now suppose that the true distribution of the data is not
in the model. We distinguish between two different cases. Observe
that the true distribution cannot be modelled if we use too few
parameters, but this is not a problem. The maximum likelihood
estimator rarely chooses a model with too few parameters, for these
models do not fit the data. The problem is with eliminating models
with too many parameters. But here we are concerned with models
which do not contain the true distribution for any number of parameters.
This is a case in which we almost always find ourselves.
The model seldom (if ever) exactly fits the data. But this does
not invalidate the use of the AIC.

Let us define a parameter $\theta_0$ to be a choice of $\theta$ which maximizes
$E_X \log f(x|\theta)$. When the true distribution g(x) is in the model,
then this implies $f(x,\theta_0) = g(x)$. Here we have only that

$$E_X \log f(x|\theta_0) = \max_\theta E_X \log f(x|\theta).$$

We then have that

$$E_X \log f(x|\theta_0) \approx E_X \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{tr}(J^{-1}(\theta_0) I(\theta_0))$$

and so

$$E_X E_Y \log f(y|\hat\theta(x)) \approx E_X \ell(\hat\theta) - \frac{1}{N}\,\mathrm{tr}(J^{-1}(\theta_0) I(\theta_0)).$$

If the true distribution g(x) is close to $f(x|\theta_0)$, then $J(\theta_0)$
will be close to $I(\theta_0)$, in which case $\mathrm{tr}(J^{-1}(\theta_0) I(\theta_0)) \approx k$.
In this case, we use the AIC and expect the AIC to choose a good
estimator of $\theta_0$.

If the true distribution is not well modelled by $f(x|\theta)$,
then we should choose a better model, rather than alter the AIC.

4. APPLICATIONS

The simplicity and generality of the AIC procedure makes it
particularly useful. There are many areas in which the AIC can
be used. We want to discuss a very simple application of the AIC
to demonstrate its use, and hopefully, in this simple environment,
we can better understand how the AIC functions and perhaps judge
its effectiveness.

We consider the problem of determining the bin size for a
histogram of univariate data, and we recall that for a fixed
bin size, the traditional height of the bins is a maximum likelihood
estimate. Given independent observations $x_1, \cdots, x_N$, and an
interval of numbers [a,b], we want to find the choice of bin
size which minimizes the AIC. If the interval is divided into p
bins and $N_i$, $i = 1, \cdots, p$, is the number of observations in each
bin, then the AIC equals (except for an additive constant)

$$-2\left(\sum_{i=1}^{p} N_i \log N_i + N \log p\right) + 2(p - 1).$$

(With p equal bins the maximum likelihood height of bin i is
$N_i p / (N(b-a))$, so the log likelihood is $\sum N_i \log N_i + N \log p$
up to terms which do not depend on p.)

In some examples involving normally distributed observations,
we observe that the AIC chooses a conservative number of bins.
In 5 examples, using 100 independent normally distributed observations
for each case, the AIC chose from 5 to 8 bins. This is a
reasonable, if perhaps conservative, number of bins. We can see
from the examples in Figure 1 that the AIC provides a number of
bins which gives a smooth histogram. A larger number of bins could
be used, but not too much larger. Figure 1 is typical of other
examples.

When the sample size is increased, the number of bins chosen
by the AIC is also increased. In each case, the resulting histogram
is a smooth histogram (see Figure 2).

That the number of bins increases with the sample size at
first appears to be a problem. How can we choose the order of
the model, if the AIC chooses to use more and more parameters as
the sample size increases?

To understand this situation, it is necessary to distinguish
two cases. The first case is that the true density is in the
model for some finite number of parameters. The histogram is an
example of a second type, where the true density is not in the
model. The normal density function cannot be expressed as a
histogram using a finite number of bins. Therefore, for any
finite sample, the AIC chooses a small number of bins relative
to the sample size. As the sample is increased, the AIC can
reject histograms which have only a small number of bins and uses
histograms with smaller mesh size.
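The bin-selection rule discussed in this section can be sketched as follows.
With p equal bins on [a,b] the maximum likelihood bin height is
N_i p / (N(b-a)), so up to terms that do not depend on p the log likelihood is
sum N_i log N_i + N log p, and the criterion is minus twice that plus
2(p - 1). The function names, the search limit of 30 bins, the random seed,
and the use of the sample range as [a,b] are assumptions made for this
example and are not taken from the report.

```python
import math
import random

def histogram_aic(data, a, b, p):
    """AIC (up to a constant not depending on p) of a p-bin histogram."""
    counts = [0] * p
    width = (b - a) / p
    for x in data:
        if a <= x <= b:
            # The right endpoint b is assigned to the last bin.
            i = min(int((x - a) / width), p - 1)
            counts[i] += 1
    n = sum(counts)
    loglik = sum(c * math.log(c) for c in counts if c > 0) + n * math.log(p)
    return -2.0 * loglik + 2.0 * (p - 1)

def best_bin_count(data, a, b, max_bins=30):
    """Number of equal-width bins in 1..max_bins that minimizes the AIC."""
    return min(range(1, max_bins + 1),
               key=lambda p: histogram_aic(data, a, b, p))

# 100 standard normal observations, as in the examples described above.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(100)]
a, b = min(sample), max(sample)
p_hat = best_bin_count(sample, a, b)
```

Rerunning the sketch with a larger sample generally moves the selected
number of bins upward, which is the behaviour described in the text.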
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Figure 1. Graphs of histograms of normal data 
using different bin sizes. 
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Figure 2. 


Graphs of histograms of normal data
using AIC and different sample sizes.

The AIC tends to choose the correct number of parameters
when the model matches the data. When the model does not match
the data, the AIC chooses a reasonable number of parameters to
fit the model, and this number increases with sample size.

We will return to this topic again when we discuss mixture 
models. But now let's consider other applications of the AIC. 

The theoretical arguments might give the false impression that
the AIC can only be used in a maximum likelihood density estimation
setting, and we should dispense with that misconception at
this time. While it is true that our theoretical justification
of the AIC is based on the maximum likelihood density estimation
situation, the AIC can be applied (with prudence) to many situations
involving observations.

One of the first applications of the AIC was to time series
analysis. Hipel [1981] gives a good set of references in this
area and we will reproduce many of them here and add a few new
entries. Autoregressive Moving Average (ARMA) process applications
of the AIC have been presented by Akaike [1974], Ozaki
[1977], and Lennox, McLeod and Hipel [1977a, b]. For the ARIMA
process, see Ozaki [1977]. Applications to the Autoregressive
process were given by Akaike [1979] and Jones [1974]. Finally
Kitagawa [1980] has applied the AIC to the difficult problem of
modelling a time series which possesses a slowly changing spectrum.

There has been some recent work by Hannan [1980] on the
estimation of the order of an ARMA process. In this paper, Hannan
points out that the AIC is not a strongly consistent estimator of
the order of an ARMA process. In fact it is not even a weakly
consistent estimator of the order. He defines two other criteria
for estimating the number of parameters in an ARMA process. He
considers the following measures. If p and q are the number of
parameters in the AR and MA parts of the model, then

$$\mathrm{AIC} = -2\ell(\hat\theta) + 2(p + q)$$

$$\mathrm{BIC} = -2\ell(\hat\theta) + (p + q) \ln N$$

and

$$\Phi(p,q) = -2\ell(\hat\theta) + (p + q)\, C \ln(\ln N), \qquad C > 2.$$

Hannan shows that BIC and $\Phi(p,q)$ are strongly consistent.
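The three criteria differ only in the penalty charged per parameter: 2 for
the AIC, ln N for the BIC, and C ln(ln N) with C > 2 for the third. The
sketch below evaluates all three for a given maximized log likelihood. The
function name, the dictionary key "PHI", the default C = 2.5, and the sample
numbers are assumptions chosen for illustration, and `loglik` is taken here
to be the full (summed) log likelihood.

```python
import math

def order_criteria(loglik, p, q, n, c=2.5):
    """Return the AIC, BIC and C*ln(ln N) criteria for an ARMA(p, q) fit.

    loglik is the maximized log likelihood summed over n observations;
    c must exceed 2 and n must exceed e for ln(ln n) to be positive.
    """
    k = p + q
    return {
        "AIC": -2.0 * loglik + 2.0 * k,
        "BIC": -2.0 * loglik + k * math.log(n),
        "PHI": -2.0 * loglik + k * c * math.log(math.log(n)),
    }

# Illustrative numbers only.
scores = order_criteria(loglik=-100.0, p=1, q=1, n=1000)
```

For N = 1000 the per-parameter penalties are ordered AIC < PHI < BIC, so the
two consistent criteria are the more reluctant to add parameters.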

While these are valuable theorems, they need to be properly 
understood. In the case that the model does not fit the true 
distribution exactly, the notion of 'correct order' becomes 
meaningless. What we want is a parsimonious use of parameters to 
obtain a reasonable fit to the data. This is our overall goal in 
many statistical settings. The AIC seems to perform very well in 
this environment and so should not be ruled out just because of 
these negative results. On the other hand, the AIC may provide 
too many parameters in large sample cases. 

In Akaike [1973], the author considers factor analysis, 
principal component analysis, analysis of variance, and multiple 
regression, as other possible areas of application of the AIC. 
Kitagawa [1979] has used the AIC to detect outliers. Finally, 
choosing the order of a polynomial regression has been considered 
by Akaike [1972a] and Tanabe [1974]. 

The overall simplicity of the AIC makes it a valuable tool 
in model selection and helps integrate the model selection process 
into the estimation process. Because of these facts and the 
successful experience of many statistical investigators, the AIC 
appears to have a bright and useful future as a model selection tool. 
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5. MIXTURE MODELS 


We will now focus our attention on the problem of estimating
the number of classes in a mixture model. Let $X_1, \cdots, X_N$ be
independent identically distributed observations from a mixture
density. That is, $X_k$ has density function

$$P(x|\Theta) = \sum_{i=1}^{m} \alpha_i P_i(x|\theta_i)$$

where $\alpha_i > 0$, $\sum_{i=1}^{m} \alpha_i = 1$,
and $P_i(x|\theta_i)$ is a density function parameterized by $\theta_i \in \Omega$.
From the independent sample $X_1, \cdots, X_N$ we wish to estimate
the number of classes m and $\Theta = (\alpha_1, \cdots, \alpha_m, \theta_1, \cdots, \theta_m)$.

Given a fixed value of m, we can obtain maximum likelihood
estimates of the remaining parameters in the model. We will not
discuss this optimization problem here, but the reader is referred
to a discussion of the EM algorithm by Redner [1980].

Once we have obtained a maximum likelihood estimate for two
or more classes, we select as our estimate of the number of classes
a choice of m which minimizes

$$\mathrm{AIC}(m) = -2\ell(\hat\theta_m) + 2k_m$$

where $k_m$ is the number of free parameters in the model with m classes.

The application of the AIC to mixture models is not as straightforward
as many other applications. The difficulties arise in
several ways and we will now discuss them.

Let us suppose that the true model has $m_0$ classes. That is,
the component density functions are identifiable in the sense
of Teicher [1963], and they have a positive probability of being
observed in an independent sample. Under these conditions it is
reasonable to assume that the Fisher information matrix is nonsingular.
Although this is not always the case, it is usually
satisfied. Now let us consider the rank of the Fisher information
matrix for the model with $m_0 + 1$ classes, given that the true
distribution has exactly $m_0$ classes.

What is the rank of the Fisher information matrix? Unfortunately
that is not a well defined question, for its solution
depends on how the m order model is embedded into the m + 1
order model.

Consider two alternative methods of embedding a one class model
into a two class model and let $\theta_1$ and $\theta_2$ be one-dimensional parameters.
The first alternative is that $\theta_1$ equals $\theta_2$, and $\alpha_1$ and $\alpha_2$ are
arbitrary. In this case, the rank of the Fisher information matrix
is 1. On the other hand, if we use another embedding, say $\alpha_2 = 0$
and $\theta_2$ is arbitrary, then the rank of $I(\theta_0)$ is 2. In either case,
we are estimating 3 free parameters.
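The two ranks quoted above can be reproduced numerically. The sketch below
assumes unit-variance normal components with means theta_1 and theta_2 and
the three free parameters (alpha_1, theta_1, theta_2), with alpha_2 = 1 -
alpha_1; the report does not specify the component family, so this choice,
the function names, the integration grid and the pivot tolerance are all
assumptions. The Fisher information is approximated by trapezoidal
integration of the outer product of the score vector.

```python
import math

def phi(x, mean):
    """Unit-variance normal density (an assumed component family)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def fisher_information(alpha, t1, t2, lo=-12.0, hi=12.0, steps=4800):
    """3 x 3 information matrix for (alpha_1, theta_1, theta_2)."""
    h = (hi - lo) / steps
    info = [[0.0] * 3 for _ in range(3)]
    for s in range(steps + 1):
        x = lo + s * h
        p1, p2 = phi(x, t1), phi(x, t2)
        f = alpha * p1 + (1.0 - alpha) * p2
        if f <= 0.0:
            continue
        # Score vector: derivatives of log f with respect to each parameter.
        score = [(p1 - p2) / f,
                 alpha * p1 * (x - t1) / f,
                 (1.0 - alpha) * p2 * (x - t2) / f]
        w = h * (0.5 if s in (0, steps) else 1.0)   # trapezoid weights
        for i in range(3):
            for j in range(3):
                info[i][j] += w * f * score[i] * score[j]
    return info

def matrix_rank(a, tol=1e-8):
    """Rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in a]
    rows, cols = len(m), len(m[0])
    rank = 0
    for c in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][c] / m[rank][c]
            for k in range(c, cols):
                m[r][k] -= factor * m[rank][k]
        rank += 1
    return rank

rank_equal_means = matrix_rank(fisher_information(0.5, 0.0, 0.0))
rank_zero_weight = matrix_rank(fisher_information(1.0, 0.0, 1.0))
```

Under the first embedding the score for alpha_1 vanishes and the two mean
scores coincide, leaving rank 1; under the second only the score for theta_2
vanishes, leaving rank 2.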

This problem is compounded by the fact that in practice we do
not know the true order of the model, and the possibility exists
that $I(\theta_0)$ has full rank. In this case, it would be rank 3, if
the two class model were actually correct.

Fortunately we can use the AIC as it stands and not worry
about the rank of $I(\theta_0)$. Since its rank is not larger than the
number of free parameters (since the number of free parameters is
the dimension of the matrix) we use the AIC as stated.
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We minimize 

$$\mathrm{AIC}(m) = -2\ell(\hat\theta_m) + 2k_m$$

where $k_m$ is the number of free parameters in the m class model.

We propose to use the AIC even in the case that the true distribution
is not in the model. We have seen with histograms that
the AIC is still an effective tool in selecting the order of a
model and small changes in the data from the model should not
have a strong effect on the performance of the AIC.

We will discuss some simulations that were performed using
the AIC in the next section, but now we wish to consider other
uses of the AIC in mixture problems. Let us consider the mixture
of several multivariate normal densities. That is, each n-dimensional
observation has distribution

$$P(x) = \sum_{i=1}^{m} \alpha_i P_i(x, \mu_i, \Sigma_i)$$

where

$$P_i(x, \mu_i, \Sigma_i) = (2\pi)^{-n/2} |\Sigma_i|^{-1/2} \exp\left(-\tfrac{1}{2}(x - \mu_i)^T \Sigma_i^{-1} (x - \mu_i)\right)$$

and $\sum_{i=1}^{m} \alpha_i = 1$ and $\alpha_i > 0$, $i = 1, \cdots, m$.

The parameters in the model are $\{\alpha_i, \mu_i, \Sigma_i\}_{i=1}^{m}$ and the
number of mixing components m. We may add the additional assumption
that, although the covariances are unknown, they are assumed
to be equal.

Choosing between the free covariance model and the unknown
but equal covariance model is a problem in choosing the order of
the model. The number of parameters to be estimated in estimating
the covariance for an n-dimensional multivariate normal density is
$n(n+1)/2$. We observe that the constraint requiring the covariance
to be positive definite does not affect the number of free parameters,
since given any positive definite matrix, all the parameters
can be varied arbitrarily by some small amount without change to
the positive definiteness.

If we have a model with m classes and n dimensions, then the
number of free parameters in the m covariances is

$$k = \frac{m\,n(n+1)}{2}$$

unless the covariances are assumed to be equal. Then we have that

$$k = \frac{n(n+1)}{2}.$$

The AIC can be used to determine which of these two models
should be used on any given data set.
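
The parameter counts entering the AIC can be tabulated with a few lines of Python. This is a sketch only: the function names are illustrative, and the total count assumes that the m mean vectors and m - 1 of the mixing proportions are free in addition to the covariances.

```python
def covariance_parameters(m, n, equal_covariance):
    """Free parameters in the covariances of an m-class, n-dimensional mixture."""
    per_covariance = n * (n + 1) // 2  # a symmetric n x n matrix
    return per_covariance if equal_covariance else m * per_covariance

def mixture_free_parameters(m, n, equal_covariance):
    """Total free parameters: means, mixing proportions and covariances."""
    means = m * n
    proportions = m - 1  # the proportions sum to one
    return means + proportions + covariance_parameters(m, n, equal_covariance)

# Five bivariate classes with a common covariance.
print(mixture_free_parameters(5, 2, equal_covariance=True))
# Fifteen bivariate classes with free covariances.
print(mixture_free_parameters(15, 2, equal_covariance=False))
```

The first count, 17, agrees with the number of parameters quoted for the five class bivariate simulations of Section 6. The second, 89, is consistent with the remark there that with 15 classes and 1966 data points each parameter is based on approximately 22 data points.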

6. MIXTURE SIMULATIONS 

In order to understand the application of the AIC to mixture 
models, we have performed several simulations. The simulations in 
one dimension were designed to analyze the performance of the AIC 
as a function of class separation. The simulations were performed 
with relatively small data sets and we have observed that this 
causes the AIC to choose a conservative number of classes. This 
is similar to what was discovered in the histogram application. 

The data which we generated was a mixture of two normal
densities. The mixing proportions were equal and the true covariances
were set equal to one. The only parameters which we varied
were the sample size and the mean values. All of the parameters
in the model were estimated. The covariances were estimated
using the assumption they were equal but unknown. The tabulated
results are in Figure 3 (all of the figures for this section are
in Appendix B).

From these data sets, we observe that the performance of the
AIC is indeed a function of the class separation. When the classes
are well separated (mean values 3 standard deviations apart) the
AIC unerringly chose the correct number of classes. On the other
hand, the AIC always chose the one class model when the class
separations were small (mean values less than or equal to 1
standard deviation apart). The AIC performed well for 2 units of
separation.
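
An experiment of this kind can be sketched as follows. The code is not the program used for the simulations reported here: it is a minimal EM iteration for a one-dimensional normal mixture with a common unknown variance, and the starting values (means at evenly spaced quantiles), the stopping rule, the random seed and the exact 150/150 split between the two classes are all assumptions made for illustration.

```python
import numpy as np

def max_log_likelihood(x, m, n_iter=500, tol=1e-8):
    """EM for a 1-D normal mixture with m classes sharing one unknown variance.
    Returns the largest log likelihood reached by the iteration."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mu = np.quantile(x, (np.arange(m) + 0.5) / m)  # hypothetical starting means
    alpha = np.full(m, 1.0 / m)
    var = x.var()
    prev = -np.inf
    for _ in range(n_iter):
        # log of alpha_i * N(x_k; mu_i, var), shape (n, m)
        log_p = (np.log(alpha) - 0.5 * np.log(2.0 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        top = log_p.max(axis=1, keepdims=True)
        log_mix = top[:, 0] + np.log(np.exp(log_p - top).sum(axis=1))
        loglik = log_mix.sum()
        if loglik - prev < tol:
            break
        prev = loglik
        resp = np.exp(log_p - log_mix[:, None])  # posterior class probabilities
        nk = resp.sum(axis=0) + 1e-12
        alpha = nk / nk.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum() / n
    return loglik

def aic(x, m):
    """AIC with k = m means + (m - 1) proportions + 1 common variance."""
    return -2.0 * max_log_likelihood(x, m) + 2.0 * (2 * m)

rng = np.random.default_rng(0)
# Two unit-variance classes, 3 standard deviations apart, 300 points in all.
x = np.concatenate([rng.normal(0.0, 1.0, 150), rng.normal(3.0, 1.0, 150)])
aic_values = {m: aic(x, m) for m in (1, 2, 3)}
print(aic_values)
```

With each additional class the penalty grows by 4, so a further class is accepted only if it raises the maximized log likelihood by more than 2.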

To test the limits of the AIC, we considered another simulation.
This simulation was composed of 10 repetitions of the one
standard deviation separation with 300 observations. The results
are contained in Table 1. We can observe two things in this
table. Obviously the AIC consistently chose too few classes for
the model. In fact, 8 out of 10 times too few classes were
chosen, and the three class model was never chosen. The histograms
and estimated density functions for the first two runs are
in Figure 4.

One can estimate the performance of the AIC for larger sample 
sizes by considering Table 1. It appears that for sample sizes 
in the range of 2000 data points the AIC would choose the correct 
model about one-half of the time. 

Our final calculations in one dimension involve Landsat data 
from one scan line of segment 1618. We use the AIC to estimate 
the number of classes in the model and the resulting answer was 
three classes. On consulting ground truth, we see that 88 percent 
of the data lie in three ground truth classes, and no other ground 
truth class comprised more than 5 percent of the scan line. The 
results for channel three and the brightness component are pre- 
sented in Figures 5 and 6. 
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Table 1. AIC values for 10 simulations. 


                  No. of Classes

Run Number        1         2         3

     1          911.0     914.2     918.0
     2 *        918.0     915.1     918.0
     3          924.8     927.0     931.0
     4          931.4     935.4     939.2
     5 *        894.6     891.0     892.2
     6          890.6     894.6     898.6
     7          933.2     936.0     938.8
     8          924.2     927.2     931.0
     9          886.6     888.4     892.2
    10          907.0     911.0     914.8

The overall effect of these data sets is to show that the
AIC chooses a reasonable number of classes, considering the sample
size.

As a second stage in our investigation of the application of
the AIC to mixtures, we investigated mixtures of bivariate normal
density functions. These investigations consisted of simulations
with three and five bivariate normal densities at several different
separations, and also the estimation of the number of normal
classes in a portion of a Landsat segment.

We initially investigated a mixture of three normal classes 
at various separations. For simplicity of the example, we generated
classes which had equal probabilities and which each had the identity 
matrix as its covariance matrix. As in the one-dimensional examples, 
we varied the means in the simulations and estimated the covariances 
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under the hypothesis that they were equal but unknown. In Figures 
7, 8 and 9, the true values of the means, the values of the AIC
for the different models, and a graph of the true density function
are displayed. From the table of AIC values in each of the three
examples, one can see that the AIC chose the correct number of
classes except in Case III, where the true mean separation was
small, and Case IC. Although the classes are well separated in
Case IC, the AIC chose the fourth order model. This type of error
is to be expected a certain percentage of the time.

We extended our investigation to mixtures of five normal
classes. Again the proportions of the classes were equal and the
covariances of the five classes were equal to the identity matrix.

In Figures 10 and 11, we see the true mean values, the values of 
the AIC for the different models and graphs of the true density 
function. One thousand sample points were used since we must 
estimate 17 parameters under the assumption that the covariances 
are unknown but are equal. Again we see that the AIC chose the 
correct number of classes when the populations are well separated 
and chose too few classes in the case that there is considerable 
overlap between classes. 

Let us now consider a more realistic data set. We selected
Landsat segment 1633 for this test. Using ground truth data, we
selected the pure elements of the scene. We define a pixel to be
a pure pixel if it has neighbors which are all of the same ground
truth class. The data set was reduced in dimension by the use of
the Kauth transformation to the greenness and brightness plane.
We used the first 20 percent of this two dimensional data set to
form our working data set.
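
The selection of pure pixels can be sketched as below. The definition above does not specify the neighborhood, so this example assumes the eight surrounding pixels, requires them to share the class of the pixel itself, and treats pixels on the border of the scene as not pure. These choices and the function name are illustrative assumptions rather than a description of the procedure actually used.

```python
import numpy as np

def pure_pixel_mask(labels):
    """True where a pixel and all 8 of its neighbors share one ground truth class."""
    labels = np.asarray(labels)
    rows, cols = labels.shape
    mask = np.zeros((rows, cols), dtype=bool)
    if rows < 3 or cols < 3:
        return mask
    center = labels[1:-1, 1:-1]
    same = np.ones(center.shape, dtype=bool)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            # Neighbor at offset (dr, dc) for every interior pixel.
            neighbor = labels[1 + dr: rows - 1 + dr, 1 + dc: cols - 1 + dc]
            same &= (neighbor == center)
    mask[1:-1, 1:-1] = same
    return mask

# A 5 x 5 scene: three columns of class 1 followed by two columns of class 2.
scene = np.array([[1, 1, 1, 2, 2]] * 5)
mask = pure_pixel_mask(scene)
print(mask.astype(int))
```

Only the interior pixels of the second column are pure in this small scene, since every other interior pixel touches the boundary between the two classes.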
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We modelled this data set, which comprises 1966 data points, 
by a mixture of bivariate normal density functions. During some
of our calculations of the maximum likelihood estimate, we
observed that the maximum likelihood iteration was proceeding to
a singularity of the likelihood function. This not only causes
numerical problems but completely invalidates the use of the AIC
to determine the number of classes. Singularities of the likelihood
function must be avoided. There are several methods for
avoiding the singularities of the likelihood function. The
method which we implemented was an application of a penalty term.
This penalty term forces the likelihood iteration to avoid the
singularities of the likelihood function (see Redner (1980)).

This type of adjustment to the natural maximum likelihood
iteration is often necessary. Since the data is discrete, that
is, the data takes on only integer values, this is a common problem.
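
The nature of the singularity can be seen in a one-dimensional example. In the sketch below one class is centered exactly on a data value and its standard deviation is allowed to shrink, while the other class is held fixed. The data, the equal proportions and the function name are hypothetical, and the penalty term of Redner (1980) is not reproduced here.

```python
import math

def two_class_log_likelihood(data, center, sigma):
    """Log likelihood of 0.5*N(center, sigma^2) + 0.5*N(mean(data), 1)."""
    background_mean = sum(data) / len(data)
    total = 0.0
    for x in data:
        narrow = (math.exp(-0.5 * ((x - center) / sigma) ** 2)
                  / (sigma * math.sqrt(2.0 * math.pi)))
        broad = (math.exp(-0.5 * (x - background_mean) ** 2)
                 / math.sqrt(2.0 * math.pi))
        total += math.log(0.5 * narrow + 0.5 * broad)
    return total

# Integer-valued data with a repeated value at 4.
data = [3, 4, 4, 5, 6, 7, 7, 8]
log_likelihoods = [two_class_log_likelihood(data, 4, s) for s in (1e-1, 1e-2, 1e-3)]
print(log_likelihoods)
```

The log likelihood grows without bound as the standard deviation of the narrow class tends to zero, because the density at the data value on which it is centered is unbounded while every other observation is still covered by the broad class. Repeated integer values make such points easy for the iteration to find.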

With this adjustment, the maximum likelihood estimate for
various numbers of classes was calculated. Table 2 contains the
AIC values. It was not possible to completely determine the
correct number of classes using the AIC due to limits on machine
time. But the reader can observe from these numbers that the
number of classes is quite large. With 15 classes in the model,
the maximum likelihood estimate for each parameter is based on
approximately 22 data points. Furthermore it appears that the
true number of classes, as determined by the AIC, might be considerably
larger than 15. This is unacceptable for the type of
application for which this is intended. One would expect that the
number of classes chosen by the AIC for a full Landsat segment
would be much larger (perhaps two to four times as large).
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Table 2. AIC values for Landsat Segment 


No. of Classes    AIC Values    BIC Values

       7            33194         33423
       8            33130         33392
       9            33126         33422
      10            33078         33407
      11            33078         33441
      15            32954         33511


This brings us back to our previous remarks concerning the
consistency of the AIC and the tendency for the AIC to choose a
large number of classes when the true model is not a good approximation
to the data. Because of the poor showing of the AIC in this
example, we extended the experiment to consider the BIC. The results
of these calculations are also in Table 2. From these numbers, one
can see that the trend of the AIC to choose a large number of classes
is not reflected in this table of BIC values. In fact, the choice
of 8 classes by the BIC appears to be a much better value for the
types of applications for which the model is intended. Figure 12
contains the maximum likelihood estimates for the model with 8
classes and also contains the scatter plot of the 1966 data points.
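
The different behavior of the two criteria comes from the penalty alone. As a sketch, assuming the Schwarz form of the BIC, that is $-2\ell(\hat\theta) + k \ln N$, and free covariances in two dimensions, the gap between the criteria for a given number of classes can be computed directly. The function names are illustrative.

```python
import math

def free_parameters(m, n=2):
    """Means, free covariances and proportions of an m-class mixture in n dimensions."""
    return m * n + m * n * (n + 1) // 2 + (m - 1)

def bic_minus_aic(m, sample_size, n=2):
    """BIC - AIC = k * (ln N - 2) when AIC = -2L + 2k and BIC = -2L + k ln N."""
    return free_parameters(m, n) * (math.log(sample_size) - 2.0)

gaps = {m: round(bic_minus_aic(m, 1966)) for m in (7, 8, 9, 10, 11)}
print(gaps)
```

Under these assumptions the gaps for 8, 9, 10 and 11 classes are 262, 296, 329 and 363, which agree with the differences between the BIC and AIC columns of Table 2 for those rows. Each added class costs 12 in the AIC but about 45.5 in the BIC for 1966 data points, which is why the BIC settles on fewer classes.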

In response to the poor showing of the AIC, we considered one
final experiment. This estimate of the parameters and the number of
classes proceeded in a three step process.

First, the maximum likelihood estimate for models with different
numbers of classes was calculated under the common covariance assumption.
The maximization of the likelihood function with the equal
covariance hypothesis is a very stable optimization problem. It
also has fewer local maxima and so it is much easier to find the
global maximum.

The second phase of the procedure involved determination of 
the number of classes to use in the model using the AIC. In this 
case, the AIC chose 7 classes for the model. Table 3 contains the 
AIC values. 

Since it appears that the common covariance assumption is
not valid for Landsat data, in the final step we fixed the mean
values and iterated on the proportions and the covariances, the
covariances being allowed to vary independently. This provides a
significant improvement to the fit to the data according to the
AIC. The new AIC value is 33194.
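
The final step of this procedure can be sketched as an EM-type iteration in which the class means are held fixed. The code below is an illustration under stated assumptions and not the program used for the Landsat experiment: it starts every class from the sample covariance of the whole data set, uses no penalty term, and is exercised on synthetic data with hypothetical means.

```python
import numpy as np

def refit_with_fixed_means(x, means, n_iter=50):
    """Hold the class means fixed; iterate on proportions and free covariances.
    Returns proportions, covariances and the log likelihood before each update."""
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float)
    n_points, dim = x.shape
    m = means.shape[0]
    alpha = np.full(m, 1.0 / m)
    # Hypothetical start: one common covariance for every class.
    covs = np.array([np.cov(x, rowvar=False) for _ in range(m)])
    history = []
    for _ in range(n_iter):
        log_p = np.empty((n_points, m))
        for i in range(m):
            diff = x - means[i]
            solved = np.linalg.solve(covs[i], diff.T).T
            maha = (diff * solved).sum(axis=1)  # squared Mahalanobis distances
            _, logdet = np.linalg.slogdet(covs[i])
            log_p[:, i] = (np.log(alpha[i])
                           - 0.5 * (dim * np.log(2.0 * np.pi) + logdet + maha))
        top = log_p.max(axis=1, keepdims=True)
        log_mix = top[:, 0] + np.log(np.exp(log_p - top).sum(axis=1))
        history.append(log_mix.sum())
        resp = np.exp(log_p - log_mix[:, None])  # posterior class probabilities
        nk = resp.sum(axis=0)
        alpha = nk / n_points
        for i in range(m):
            diff = x - means[i]
            covs[i] = (resp[:, i, None] * diff).T @ diff / nk[i]
    return alpha, covs, history

rng = np.random.default_rng(1)
true_means = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
scales = [0.5, 1.0, 2.0]
x = np.vstack([mu + s * rng.normal(size=(200, 2)) for mu, s in zip(true_means, scales)])
alpha, covs, history = refit_with_fixed_means(x, true_means)
print(history[0], history[-1])
```

Because the covariance update is the exact maximizer for fixed means, the log likelihood cannot decrease from one iteration to the next, which is the sense in which releasing the common covariance constraint can only improve the fit.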

Table 3. AIC values for the equal covariance model

No. of Classes    AIC

       6          33546
       7          33276
       8          33280
      10          33292

Although the AIC has performed well in numerous applications,
we observe, in the application of the AIC directly to a large
portion of a Landsat segment, that it provides a model with a
large number of classes. The cause of this problem is probably
the unboundedness of the likelihood function. The use of the
penalty term was not sufficient to completely rectify this situation.
We should emphasize that it is the type of application which
we have in mind and the large amount of computation which has to
be done which causes us to conclude that the answer given by the
direct application of the AIC is not suitable. In addition, the
ratio of the number of data points available to the number of
parameters being estimated is not particularly large. All of these
considerations lead us to consider possible alternatives. The
most natural alteration of the AIC is the BIC, which gives us a
more desirable number of classes along with some indication that
it might consistently give the correct number of classes when the
model is correct and when we have large data sets. The other
possibility is given by the three step application of the AIC to
the mixture problem. This approach is appealing because of the
stability of the natural fixed point iteration if the covariances
are assumed to be unknown but equal.

7. CONCLUDING REMARKS 

The AIC has been shown to be effective in a wide range of applications.
These demonstrations now include the mixture density
problems. For some data sets it appears that the BIC may provide
more useful results than the direct application of the AIC. The
authors are optimistic about the possible uses of the AIC and BIC
in determining the number of components in a mixture model and in
determining which of several mixture models to use.

On the other hand, we do not consider the simulations and
examples presented in this paper as sufficient proof of the applicability
of the AIC or BIC in model selection for the mixture
problem. In particular, the AIC and BIC have not been applied to
a full Landsat segment, and certainly many segments must be considered
before a judgement on these criteria can be made.
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It is hoped that experiments along these lines will be carried 
out in the near future. The application of the AIC and BIC to the
HISSE procedure is a particular experiment that should be carried
out, as well as the application of these criteria to MLE methods
applied to profile data.

There are numerous other areas which need to be considered in
the application of the AIC and BIC to mixture models. The mechanics
of applying the AIC and BIC to mixture models need to be considered
further. Since we are dealing with expensive non-linear optimization
problems to obtain the likelihood estimates, we must consider
the best way to find the AIC or BIC estimates. The suggestion by
Wolf (1970) may be particularly applicable to this area. Wolf
suggests the use of certain non-parametric clustering schemes to
assist in obtaining initial guesses of the MLE.

Finally we reiterate that the AIC and BIC can be used in a
wide range of applications. We have emphasized the use of the
AIC and BIC in model selection for mixture problems because that
is a problem in which we have a deep interest. However, the use
of the AIC and BIC in other areas should not be neglected, and it
is hoped that the applications which we suggested in section 4
might lead to other uses of these criteria.
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APPENDIX A 


Proofs of the approximations provided 
in section 3. 



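Let g(x) denote the true density function for a random variable
X and let $f(x,\theta)$ be a model for g(x). Let $\theta_0$ be a choice of
parameters $\theta$ so that

$$E_x \log f(x,\theta_0) = \max_{\theta} E_x \log f(x,\theta).$$

In the case that the true density g(x) is in the model, then this
implies that $f(x,\theta_0) = g(x)$.

We also define

$$I(\theta_0) = E_x\left\{ \frac{\partial}{\partial\theta} \log f(x,\theta_0) \, \frac{\partial}{\partial\theta'} \log f(x,\theta_0) \right\}$$

to be the Fisher information matrix and define

$$J(\theta_0) = -E_x\left\{ \frac{\partial^2}{\partial\theta\,\partial\theta'} \log f(x,\theta_0) \right\}$$

to be the negative of the expected Hessian of the log likelihood
function. Let $\hat\theta$ be the maximum likelihood estimate, where $\hat\theta$ is
shorthand for $\hat\theta(x_1, \ldots, x_N)$, a function of the observations.

Finally we define

$$\ell(\theta) = \frac{1}{N} \sum_{k=1}^{N} \log f(x_k,\theta)$$

to be the log likelihood function.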

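Assuming the true density function g(x) to be in the model
and assuming the necessary regularity conditions, we will show that

$$E_x \ell(\theta_0) \approx E_x \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{rank}(I(\theta_0)) \qquad (1)$$

and

$$-2\,E_x E_y \log f(y,\hat\theta(x)) \approx -2\,E_x \ell(\hat\theta) + \frac{2}{N}\,\mathrm{rank}(I(\theta_0)). \qquad (2)$$

Later on we will show that, in the case that g(x) is not in
the model and if $J(\theta_0)$ and $I(\theta_0)$ are nonsingular, then we have that

$$E_x \ell(\theta_0) \approx E_x \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{tr}\{J(\theta_0)^{-1} I(\theta_0)\} \qquad (3)$$

and

$$-2\,E_x E_y \log f(y,\hat\theta(x)) \approx -2\,E_x \ell(\hat\theta) + \frac{2}{N}\,\mathrm{tr}\{J(\theta_0)^{-1} I(\theta_0)\}. \qquad (4)$$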
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Recall that by the law of large numbers we have 


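$$\frac{1}{N} \sum_{k=1}^{N} \frac{\partial^2}{\partial\theta\,\partial\theta'} \log f(x_k,\theta_0) \to -J(\theta_0). \qquad (5)$$

Using the central limit theorem we can observe that

$$\frac{1}{\sqrt{N}} \sum_{k=1}^{N} \frac{\partial}{\partial\theta} \log f(x_k,\theta_0) \to N(0, I(\theta_0)). \qquad (6)$$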

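From the following equation

$$\begin{aligned}
0 = \frac{\partial}{\partial\theta} \ell(\hat\theta) &= \frac{1}{N} \sum_{k=1}^{N} \frac{\partial}{\partial\theta} \log f(x_k,\hat\theta) \\
&\approx \frac{1}{N} \sum_{k=1}^{N} \frac{\partial}{\partial\theta} \log f(x_k,\theta_0) + \frac{1}{N} \sum_{k=1}^{N} \frac{\partial^2}{\partial\theta\,\partial\theta'} \log f(x_k,\theta_0)\,(\hat\theta - \theta_0)
\end{aligned}$$

we calculate, using (5) and (6), that

$$\sqrt{N}\, J(\theta_0)(\hat\theta - \theta_0) \sim N(0, I(\theta_0)).$$

If $J(\theta_0)$ (respectively $I(\theta_0)$) is a singular matrix, then we
have a singular normal density. In this case let $P: R^m \to R^n$ for
m > n be a matrix such that $P P^T$ is the n x n identity matrix and
$P^T P$ is the projection onto the range of $J(\theta_0)$. Then

$$\sqrt{N}\, P J(\theta_0)(\hat\theta - \theta_0) \sim N(0, P I(\theta_0) P^T)$$

where $P I(\theta_0) P^T = P J(\theta_0) P^T$ is nonsingular. From the definition of P,

$$\sqrt{N}\, P J(\theta_0) P^T P(\hat\theta - \theta_0) \sim N(0, P I(\theta_0) P^T)$$

and so

$$\sqrt{N}\, P(\hat\theta - \theta_0) \sim N(0, (P J(\theta_0) P^T)^{-1} (P I(\theta_0) P^T) (P J(\theta_0) P^T)^{-1}). \qquad (7)$$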

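Now observe that

$$\frac{1}{N} \sum_{k=1}^{N} \log f(x_k,\theta_0) \approx \frac{1}{N} \sum_{k=1}^{N} \log f(x_k,\hat\theta) - \frac{1}{2} (\theta_0 - \hat\theta)^T J(\theta_0) (\theta_0 - \hat\theta)$$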
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and so by taking expectations 

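$$\begin{aligned}
E_x \ell(\theta_0) &\approx E_x \ell(\hat\theta) - \frac{1}{2} E_x (\theta_0 - \hat\theta)^T J(\theta_0) (\theta_0 - \hat\theta) \\
&\approx E_x \ell(\hat\theta) - \frac{1}{2} E_x (\theta_0 - \hat\theta)^T P^T P J(\theta_0) P^T P (\theta_0 - \hat\theta).
\end{aligned}$$

Therefore,

$$\begin{aligned}
E_x \ell(\theta_0) &\approx E_x \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{tr}\{(P I(\theta_0) P^T)(P J(\theta_0) P^T)^{-1}\} \qquad (8) \\
&\approx E_x \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{rank}(I(\theta_0))
\end{aligned}$$

since $J(\theta_0) = I(\theta_0)$.

We see that $\ell(\hat\theta)$ is a biased estimate of $\ell(\theta_0)$ and this bias
is on the average equal to $\frac{1}{2N}\,\mathrm{rank}\, I(\theta_0)$.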
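
The size of this bias can be checked by simulation in a simple case. The sketch below reads approximation (1) as $E_x \ell(\theta_0) \approx E_x \ell(\hat\theta) - \mathrm{rank}(I(\theta_0))/(2N)$ and tests it for a normal model with unknown mean and variance, for which the rank is 2 and $N[\ell(\hat\theta) - \ell(\theta_0)]$ should average about 1. The sample size, number of repetitions, seed and function name are illustrative assumptions.

```python
import numpy as np

def mean_log_likelihood_gain(n_obs=100, n_reps=20000, seed=0):
    """Average of N*[l(theta_hat) - l(theta_0)] for N(0, 1) samples fitted by a
    normal model with unknown mean and variance (two free parameters)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=(n_reps, n_obs))
    var_hat = x.var(axis=1)  # maximum likelihood variance about the sample mean
    # N*l(theta_hat) = -N/2 * (log(2*pi*var_hat) + 1)
    fitted = -0.5 * n_obs * (np.log(2.0 * np.pi * var_hat) + 1.0)
    # N*l(theta_0) = -N/2 * log(2*pi) - sum(x^2)/2 at the true mean 0, variance 1
    true = -0.5 * n_obs * np.log(2.0 * np.pi) - 0.5 * (x ** 2).sum(axis=1)
    return float((fitted - true).mean())

gain = mean_log_likelihood_gain()
print(gain)
```

An average close to 1, that is one half of the number of free parameters, is what the approximation predicts once both sides are multiplied by N.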

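We now proceed to establishing (2). First observe that

$$\begin{aligned}
E_y \log f(y,\hat\theta) &= \int \log f(y,\hat\theta)\, g(y)\, dy \\
&\approx \int \log f(y,\theta_0)\, g(y)\, dy + \int \frac{\partial}{\partial\theta'} \{\log f(y,\theta_0)\}\, g(y)\, dy\, (\hat\theta - \theta_0) \\
&\quad + \frac{1}{2} (\hat\theta - \theta_0)^T \int \frac{\partial^2}{\partial\theta\,\partial\theta'} \{\log f(y,\theta_0)\}\, g(y)\, dy\, (\hat\theta - \theta_0) \\
&\approx E_y \log f(y,\theta_0) - \frac{1}{2} (\hat\theta - \theta_0)^T J(\theta_0) (\hat\theta - \theta_0) \\
&\approx E_y \log f(y,\theta_0) - \frac{1}{2} (\hat\theta - \theta_0)^T P^T P J(\theta_0) P^T P (\hat\theta - \theta_0),
\end{aligned}$$

where the first-order term vanishes because $\theta_0$ maximizes
$E_y \log f(y,\theta)$.

Taking expectations we find that

$$\begin{aligned}
E_x E_y \log f(y,\hat\theta) &\approx E_x E_y \log f(y,\theta_0) - \frac{1}{2N}\,\mathrm{tr}\{(P J(\theta_0) P^T)^{-1} (P I(\theta_0) P^T)\} \\
&\approx E_y \log f(y,\theta_0) - \frac{1}{2N}\,\mathrm{rank}\, I(\theta_0).
\end{aligned}$$

We observe that equation (1) is equivalent to

$$E_y \log f(y,\theta_0) \approx E_x(\ell(\hat\theta)) - \frac{1}{2N}\,\mathrm{rank}\, I(\theta_0).$$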
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Putting these two estimates together 

$$E_x E_y \log f(y,\hat\theta) \approx E_x \ell(\hat\theta) - \frac{1}{N}\,\mathrm{rank}\, I(\theta_0)$$

which establishes (2).

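Let us now consider the case that g(x) is not in the model
and so $\theta_0$ is a choice of parameters which maximize $E_x \log f(x,\theta)$.
Let us also assume that the Fisher information matrix $I(\theta_0)$ is
nonsingular.

Observe that equation (6) still holds and we have immediately
that

$$E_x \ell(\theta_0) \approx E_x \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{tr}\{(P J(\theta_0) P^T)^{-1} (P I(\theta_0) P^T)\}.$$

If $J(\theta_0)$ is nonsingular then

$$E_x \ell(\theta_0) \approx E_x \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{tr}(J(\theta_0)^{-1} I(\theta_0))$$

and this gives us equation (3).

We can also observe that (7) holds, that is

$$E_x E_y \log f(y,\hat\theta) \approx E_x E_y \log f(y,\theta_0) - \frac{1}{2N}\,\mathrm{tr}((P J(\theta_0) P^T)^{-1} (P I(\theta_0) P^T))$$

and also

$$E_x E_y \log f(y,\theta_0) \approx E_x \ell(\hat\theta) - \frac{1}{2N}\,\mathrm{tr}((P J(\theta_0) P^T)^{-1} (P I(\theta_0) P^T)).$$

Combining these facts one gets that

$$E_x E_y \log f(y,\hat\theta) \approx E_x \ell(\hat\theta) - \frac{1}{N}\,\mathrm{tr}((P J(\theta_0) P^T)^{-1} (P I(\theta_0) P^T)).$$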
Again if J is nonsingular, then P is the identity matrix and

$$E_x E_y \log f(y,\hat\theta) \approx E_x \ell(\hat\theta) - \frac{1}{N}\,\mathrm{tr}(J(\theta_0)^{-1} I(\theta_0))$$

and this establishes our final result.

Figures from section 6

                          Class Separation
No. of Classes     .5            1.0           2.0           3.0

      1         [illegible]   [illegible]   [illegible]    417.2
      2           295.4       [illegible]     340.0        412.6
      3           299.0       [illegible]     343.0      [illegible]


100 DATA POINTS

                          Class Separation
No. of Classes     .5            1.0           2.0           3.0

      1           273.4         303.4         330.8      [illegible]
      2           277.4         305.8       [illegible]  [illegible]
      3         [illegible]     307.8         335.2        410.6


300 DATA POINTS

                          Class Separation
No. of Classes     .5            1.0           2.0           3.0

      1         [illegible]     886.6        1040.6       1213.2
      2         [illegible]     888.4        1058.0       1171.8
      3         [illegible]     892.2       [illegible]   1175.8


300 DATA POINTS

                          Class Separation
No. of Classes     .5            1.0           2.0           3.0

      1           853.2       [illegible]    1052.6       1192.6
      2           897.2         911.0        1054.0      [illegible]
      3           908.2         914.8        1058.0      [illegible]


Figure 3. Tables of AIC values for 4 simulations.

Each table contains AIC values for varying
numbers of classes and class separation.
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Figure 4 . Graphs of histograms and estimated density functions 
for runs number one and two in table 1. 
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Figure 5. Graphs and AIC values for segment 1618/235 
line 62 channel 3. 
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Figure 6. Graphs and AIC values for segment 

1618/235 line 62 brightness coordinate.

Figure 7. Mixture model density and tables of AIC values for
simulations ( * denotes the minimum AIC values) .
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Figure 8. Mixture model density and table of AIC values. 
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Figure 9. Mixture model density and table of AIC values. 
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Figure 10. Mixture model density and table of AIC values. 


CASE V. TABLE OF AIC VALUES (1000 POINTS)

            True Mean Values    No. of Classes    AIC Values

Class 1       1.00, 1.00              1             6803.8
Class 2       3.12, 1.00              2             6795.6
Class 3       3.12, 3.12              3             6801.6
Class 4       1.00, 3.12              4             6794.2 *
Class 5       2.06, 2.06              5             6796.8
                                      6             6802.0

[Plot: Case V, model mixture density.]


Figure 11. Mixture model density and table of AIC values.

Figure 12. Maximum likelihood estimates with 8 classes and
scatter plot.
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